ABRAHAM WALD, 1902-1950
By J. Wolfowitz
Cornell University 


In November, 1950, Abraham Wald left the United States, on leave from 
his post as Professor of Mathematical Statistics at Columbia University, for an 
invited tour of Indian universities and research centers. Accompanying him 
was Mrs. Wald. His book on statistical decision functions had recently been 
published, and he intended to teach the new theory to Indian statisticians. 
On December 13, 1950, an Air India plane, lost in a fog, crashed into a peak of 
the Nilgiris, killing all aboard, among them Professor and Mrs. Wald. Thus, 
cut off in the prime of activity, died this great statistician, whose work had 
changed the whole course and emphasis of modern statistics. The personal loss 
will be felt by his numerous friends, but all must mourn for the statistical dis- 
coveries yet unmade which were buried in the flaming wreckage on a mountain 
side in South India and which will slowly and painfully have to be made by 
others. 

1. Abraham Wald was born in Cluj, Rumania, on October 31, 1902. His 
father was a small business man, but there was an intellectual atmosphere in 
the family. His grandfather was a famous rabbi, and the father had considerable 
intellectual interests. There were five other children in the family, and one 
brother, Martin, was considered as intellectually gifted as Abraham. Martin 
was an electrical engineer with many inventions to his credit, and Abraham 
rendered mathematical help to his brother in a few of the latter’s researches. 
Wald’s sisters, inventor brother, their spouses and children, his parents and 
other relatives, died in German crematoria and concentration camps. One 
brother only survived and is now in the United States. 

Wald was not admitted to the local gymnasium because, as the son of an 
orthodox Jew, he would not attend school on Saturday, the Jewish Sabbath. 
He studied by himself and was admitted to the University of Cluj. After gradua- 
tion from the local university he experienced considerable difficulty in entering 
the University of Vienna because of religious restrictions. He spent a year in 
the engineering school at Vienna, but finally was admitted to study mathematics 
at the University of Vienna. 

At the University of Vienna Wald became acquainted with Menger and 
Hahn. Both soon recognized Wald’s abilities and the former became his lifelong 
friend. Wald became a frequent contributor to, and assistant editor of, the 
regularly issued reports of the proceedings of Menger’s colloquium, from which 
new results in mathematics issued steadily. Wald’s initial work in mathematics 
was in geometry, his thesis dealing with a question of axiomatics. 

Editorial Note: The first three papers published in this issue were presented at the
Abraham Wald Memorial Session held September 7, 1951, at the Minneapolis meeting
of the Institute.

In spite of his gifts, an academic post in Austria was practically barred to a
man of Wald’s background. Menger therefore advised Wald to work also in applied 
mathematics. He also put him in touch with a banker named Karl Schlesinger, 
who was interested in mathematical economics and to whom Wald gave lessons 
in mathematics. In his quest for a source of income and an opportunity to work 
in applied mathematics, Wald met Oskar Morgenstern, who was then director 
of the Institut für Konjunkturforschung. Morgenstern appreciated Wald’s
talents and increasingly employed Wald in his institute. Wald’s book {24} was
one of the products of this employment. Like Menger, Morgenstern became a 
lifelong friend of Wald’s. All three eventually emigrated to the United States. 
Morgenstern later became a coauthor, with von Neumann, of their famous
book [1]¹ on the theory of games. Wald’s most important work (whose basis
was laid in {37} completely independently) was to have many points of contact
with the theory of games, which in itself became one of Wald’s later mathe- 
matical interests. 

In addition to other papers in mathematical economics (e.g., {21}, {23}, {27},
{28}, {30}) which he wrote in Vienna, Wald also worked on the problem of
consistency of the concept of a “Kollektiv” ({20}, {29}, {31}). As the problem
was put to Wald and as he solved it, it was a difficult and noteworthy achieve- 
ment. It was an important step in von Mises’ axiomatization of probability 
and is often cited for this reason. However, the problem was difficult chiefly 
because of the way it was put. It is a simple consequence of the strong law of
large numbers for independent and identically distributed variables
that almost every sequence of observations is a Kollektiv. Thus the
modern measure-theoretic approach to the axiomatization of probability theory 
does away with the need for this pretty piece of work by Wald. 

2. Wald came to the United States in the summer of 1938 as a fellow of the 
Cowles Commission for Research in Economics. In the fall of that year he was 
released by the Cowles Commission to accept a fellowship of the Carnegie 
Corporation which was obtained for him by Harold Hotelling, then at Columbia. 
Hotelling was already then one of the leading American teachers of the modern 
theory of statistics. His was one of the few voices in the wilderness proclaiming 
the importance of the new subject. Wald spent a very busy year learning modern 
statistics by reading and attending Hotelling’s lectures. He also began that steady 
stream of papers in statistics which was never to cease until his death. It was in 
that academic year (1938-39) that what is probably his most important paper
{37} was written, at a time when his knowledge of statistics was rather limited.
Wald labored prodigiously, and most of his waking moments during this and the
next several years were given to work.

¹ References in brackets (e.g., [5]) are listed at the end of this paper; references in braces
(e.g., {26}) are listed in “The publications of Abraham Wald,” pp. 29-33 of this issue.

When Wald took up residence at Columbia he knew little of what was then
modern statistics or of the content of the courses he later taught. The great
difficulty in learning statistics then was due to the obscure manner in which
much of the statistical literature was written. In spite of this, Wald mastered
the subject so that the lectures which he gave at Columbia in 1939-40 were 
noted for their lucidity and mathematical rigor. Students not only flocked to 
them, but clamored for a record of them. Ralph J. Brookner, who later obtained 
his doctor’s degree under Wald, took notes of the lectures. These were later 
reproduced for circulation among the students only. However, their fame had 
spread widely and requests for their purchase came from all over the country. 
Wald was always anxious to restrict their circulation. He did not want them 
considered as books (which they certainly were not) and reviewed as such. 
However, his good nature often prevented strict enforcement of the rule against 
circulating the notes outside of Columbia University. Thus many of the new 
generation of American statisticians learned the theory of the analysis of variance 
from the notes of his course. 

These notes on the analysis of variance are typical of Wald’s notes and many 
of his writings. They are rigorous, accurate, and clear, but some of the proofs 
are clumsy, and the organization of the notes could be improved. The original 
lectures were given under great time pressure, and there was no time to search 
for the most elegant proofs, or to plan the organization of the course long in 
advance. Wald seldom bothered to rework his writings for mathematical elegance 
or clarity—only new results interested him. Thus the original notes were allowed 
to stand without alteration, although he would depart from them in his class 
lectures. 


As a lecturer, Wald was clear and lucid. His proofs were always carefully 
organized, his hypotheses carefully stated. Without omitting any essential 
detail he would wend his way logically and inexorably from hypothesis to con- 
clusion. He seldom gave an intuitive justification of the theorems, probably 
because he himself needed it so little. 


His lectures in 1939-40 marked the beginning of his teaching career at Colum- 
bia, which was terminated only by his death. Hotelling labored prodigiously to 
find him a permanent post at Columbia and the presence of Hotelling and Wald 
made Columbia a foremost center of mathematical statistics. Wald stayed at 
Columbia as a fellow of the Carnegie Corporation until 1941, when he was 
made assistant professor of economics. Outside recognition won him promotions 
to an associate professorship in 1943 and then to a professorship in 1944. When 
Hotelling left Columbia in 1946, a department of mathematical statistics was
formed at Columbia with Wald as professor of mathematical statistics and its
head. His fame was at its height and students came from all over the world to 
hear him. 

3. Wald’s greatest achievement was the theory of statistical decision functions, 
which includes almost all problems which are the raison d’être of statistics.
Every science develops its own techniques, and the development of techniques 
often gives rise to difficult problems. However, we must never lose sight of the 
fact that, for example, the solution to a distribution problem, no matter how 
difficult, is only a device for enabling us to answer some statistical question, and 
is not of statistical importance per se. Wald brought to statistics a very high
degree of mathematical ability and knowledge. Along with this, and in spite of 
his abstract and theoretical bent and predilections, he never, in any statistical 
investigation, lost sight of the fact that there was a question to be answered 
and a decision to be made. Practical people, in many cases dazzled by a mathe- 
matical approach they did not well understand, often did lose sight of the final 
goal, and submitted to having their problems forced into a framework into which 
they did not really fit. Wald never did this, and much of his statistical success 
was due to this fact. Wald not only posed his statistical problems clearly and 
precisely, but he posed them to fit the practical problem and to accord with the 
decisions the statistician was called on to make. This, in my opinion, was the 
key to his success—a high level of mathematical talent of the most abstract 
sort, and a true feeling for, and insight into, practical problems. The combination 
of the two in his person at such high levels was what gave him his outstanding 
character. Appropriately enough, his greatest achievement was the direct result 
of this combination. 

What was probably Wald’s most important paper ({37}) was written before 
he knew the details of modern statistical theory. Most of the important notions 
of his theory of decision functions are already present in this paper. It is true 
that observations are not taken seriatim but in one sample; this, however, is 
not too important. The notions of the decision space, of the weight and risk 
function, of a minimax solution, are all present. Wald proved that the risk 
function of a minimax solution is constant (under certain restrictions). He 
operated daringly with Bayes solutions for a priori distributions. At this time 
statisticians recoiled in horror from Bayes solutions, due to their earlier misuse, 
and under Fisher’s great authority (rightly exercised, I think). Wald made 
use of Bayes solutions purely as a mathematical tool and without invoking 
any objectionable statistical connotations. At this time he already had the 
notions of an admissible test and of a least favorable a priori distribution. It is 
very saddening to leaf through the pages of this paper and to realize that its 
author is gone from us. 

The paper went almost completely unnoticed. At this time there were few 
statisticians with the mathematical competence to read the paper. The use of 
Bayes solutions was a deterrent. Wald did not really emphasize that he was
using Bayes solutions only as a tool. His ideas were startlingly new and far off 
the beaten track. Thus a reviewer of his paper (Zentralblatt für Mathematik und
ihre Grenzgebiete, Vol. 55 (1941), p. 55) said: “... Es ist gerade der grosse
Fortschritt gewesen, dass J. Neyman und E. S. Pearson im Gegensatz zu Th. Bayes
ohne zusätzliche Annahmen ausgekommen sind. Die Einführung der beiden
Hilfsfunktionen [i.e., the weight function and the a priori distribution function]
wirkt demgegenüber wie ein Rückschritt. Die Sätze des Verfassers bleiben
leere Theorie, die kurzen Beispiele sprechen keineswegs für die praktische
Bedeutung seiner Ansätze.” [In English: “It was precisely the great advance
that J. Neyman and E. S. Pearson, in contrast to Th. Bayes, managed without
additional assumptions. The introduction of the two auxiliary functions seems,
by comparison, a step backward. The author's theorems remain empty theory;
the brief examples in no way speak for the practical significance of his
approach.”]

Wald’s personality was also a factor in the reception which his new ideas
received. He made no effort to popularize his ideas or to make them accessible
to a less mathematical public. I remember well conversations I had with him 
on the Columbia campus in the spring of 1939. I had recently begun the study 
of statistics, and tried to oppose his arguments with what I had recently learned.
As he convinced me and I grew more and more enthusiastic over what I had just 
heard from him, Wald proposed several problems for us to work on together. 
These were all of the most abstract sort, and concerned weakening restrictions 
under which he proved the theorems of his paper. For example, he was irked by 
the restriction of compactness and wished to weaken it. Clearly such a program
of research was not calculated to popularize his theory. What was required to 
do so was to apply the theory to outstanding problems. This incident illustrates 
Wald’s personality—always ready to talk about mathematics, but uninterested 
in popularization and special applications. He was practical-minded in that he 
always kept statistical ends in sight when working in statistical theory. When 
the latter was finished to his satisfaction he was not interested in its special 
application to practical problems. 

Wald did not resume serious work on decision functions until 1946. When he 
did begin work again on decision functions he was also spurred on by the con- 
nection between the newly announced results of [1] and his own theory, and by 
the general interest among economists and others aroused by the theory of 
games. 

In the most general formulation of Wald’s theory in {94} one deals with a
sequence $X_1, X_2, \ldots$ of (not necessarily independent) chance variables, about
whose joint distribution F the statistician in the beginning knows nothing except
that F is a member of a given class $\Omega$. There is given a space D of decisions d,
one of which the statistician has to make. The loss due to decision d when F is 
actually the distribution is W(F, d), where W is a given function (the “weight 
function”). (In {101} the loss is allowed to depend also upon the sequence of 
observations.) The total loss is the sum of the loss due to the decision made 
and the cost of the observations. A statistical decision function is a rule which 
at the $i$th stage ($i = 1, 2, \ldots$; the first stage is at the outset of the experiment
before any observations have been taken) tells the statistician whether or not 
to take further observations (at the first stage, whether to take any observations), 
on which chance variables to take observations (if at all), and which decision to 
make (if no further observations are to be taken). At each stage the decision 
function is a function of the preceding observations, and is a probability dis- 
tribution function over the various available possibilities. The actual decision 
is made by an independent chance mechanism governed by this distribution. 
The risk function (a function of F) is the expected value of the total loss. Natu- 
rally it depends upon the statistical decision function adopted; different decision 
functions in general have different risk functions. 

Comparison of two decision functions is made on the basis of their risk
functions. Let $r_1(F)$ and $r_2(F)$ be the risk functions of decision functions
$\delta_1$ and $\delta_2$, respectively. If $r_1(F) \le r_2(F)$ for all F in $\Omega$, $\delta_1$ is said
to be at least as good as $\delta_2$. If, in addition, the strict inequality holds for at
least one F, $\delta_1$ is said to be uniformly better than $\delta_2$. A class C of decision
functions is said to be complete (essentially complete) if for any decision
function $\delta$ not in C there exists a decision function $\delta'$ in C such that $\delta'$ is
uniformly better than (at least as good as) $\delta$.
If a probability measure $\xi$ on the space $\Omega$ is given, then the decision function
$\delta$ which minimizes

$$\int_\Omega r_\delta(F)\, d\xi$$

is called a Bayes solution (for $\xi$). It used to be thought essential to assume some
definite $\xi$, but the modern point of view rejects this notion. The importance of
Bayes solutions for Wald lay in the fact that the totality of all Bayes solutions
(or a class derived from this totality) constitutes, under certain circumstances,
a complete or essentially complete class. A minimax decision function is one
for which

$$\sup_{F} r_\delta(F)$$

is a minimum. A least favorable a priori distribution $\xi$ is one for which

$$\inf_{\delta} \int_\Omega r_\delta(F)\, d\xi$$

is a maximum. If $r_\delta(F)$ be regarded as the pay-off function of a zero-sum two-
person game between Nature and the statistician, the significance of a minimax
decision function and of a least favorable a priori distribution, as minimax
strategies in the game, becomes apparent.
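
The definitions above may be illustrated by a small numerical sketch. The toy problem below is not taken from Wald's papers: it assumes a class of only two hypothetical Bernoulli distributions, a single observation taken at no cost, a simple 0-1 weight function, and nonrandomized decision rules only. All names (P_X1, weight, risk, bayes_rule, minimax_rule, uniformly_better) are illustrative assumptions.

```python
from itertools import product

# Hypothetical toy problem (an assumption, not from the source):
# the class of distributions has two members, each a Bernoulli law
# for a single observation x in {0, 1}.
P_X1 = {"F1": 0.2, "F2": 0.7}      # P(x = 1 | F)
DECISIONS = ("d1", "d2")           # d1: "F is F1", d2: "F is F2"


def weight(F, d):
    """Weight function W(F, d): 0 for a correct decision, 1 otherwise."""
    correct = {"F1": "d1", "F2": "d2"}
    return 0.0 if correct[F] == d else 1.0


def risk(rule, F):
    """Risk of a nonrandomized rule (decision if x = 0, decision if x = 1)."""
    p1 = P_X1[F]
    return (1 - p1) * weight(F, rule[0]) + p1 * weight(F, rule[1])


# All four nonrandomized rules based on one observation.
RULES = list(product(DECISIONS, repeat=2))


def bayes_rule(xi):
    """Rule minimizing the average risk when xi is the prior weight on F1."""
    return min(RULES,
               key=lambda r: xi * risk(r, "F1") + (1 - xi) * risk(r, "F2"))


def minimax_rule():
    """Rule minimizing the maximum risk, among nonrandomized rules only."""
    return min(RULES, key=lambda r: max(risk(r, F) for F in P_X1))


def uniformly_better(r1, r2):
    """True if r1 is at least as good as r2 for every F, and better for some F."""
    pairs = [(risk(r1, F), risk(r2, F)) for F in P_X1]
    return all(a <= b for a, b in pairs) and any(a < b for a, b in pairs)
```

In this toy problem the rule that decides d1 when x = 0 and d2 when x = 1 is the Bayes solution for the prior (1/2, 1/2) and also has the smallest maximum risk among the four nonrandomized rules. Randomized rules, which the sketch omits for brevity, could lower the maximum risk slightly further.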

The main results of {94} are existence theorems and complete class theorems. 
The chief tool for proving existence is Wald’s Theorem 2.15, of which we shall 
speak below. Under his weaker restrictions Wald proves the existence of Bayes 
solutions and of a minimax solution. Under the stronger conditions he proves 
the existence of a least favorable a priori distribution. In both cases he gives 
complete and essentially complete classes in terms of Bayes solutions and their 
closures. It would be repetitious and occupy considerable space here to describe 
these results in any completeness. A relatively short and easy-to-read exposition 
is to be found in {102}.

The statistician who wants to apply the results of {94} to specific problems 
is likely to be disappointed. Except for special problems, the complete classes 
are difficult to characterize in a simple manner and have not yet been
characterized. Satisfactory general methods are not yet known for obtaining minimax
solutions. If one is not always going to use a minimax solution (to which serious 
objections have been raised) or a solution satisfying some given criterion, then 
the statistician should have the opportunity to choose from among “representative”
decision functions on the basis of their risk functions. These are not
available except for the simplest cases. It is clear that much remains to be done 
before the use of decision functions becomes common. The theory provides a
rational basis for attacking almost any statistical problem, and, when some 
computational help is available and one makes some reasonable compromises 
in the interest of computational feasibility, one can obtain a practical answer 
to many problems which the classical theory is unable to answer or answers in 
an unsatisfactory manner. However, for this purpose a relatively simple exposi- 
tion would suffice to instruct the reader in the rationale of such procedures and 
it is unnecessary for him to tackle the mathematical details of the theory. The 
principal value of Wald’s book must therefore be for research workers, and the 
practicing statistician can probably content himself with a reading of the first, 
and perhaps parts of the last, chapters. 

The book is in places not easy to read. Some of the longer arguments could, 
with some effort, be made more accessible. A number of very minor errors have 
crept in. As an example of the latter we may cite the following. A principal tool 
is Theorem 2.15. This theorem states that any sequence of probability measures 
on a compact metric space contains a subsequence which converges in the 
ordinary sense to a probability measure. Convergence in the ordinary sense
means convergence for every open set whose boundary has probability zero 
according to the limit distribution. Theorem 2.15 was found by Wald in 1947 
and in his book he states that it is related to a theorem of Krylov and Bogolyubov 
[2]. To describe the latter we first give the theorem of Helly-Bray. This theorem
states that if the probability measures $\xi_i$ approach the probability measure
$\xi_0$ in the ordinary sense, and if $\phi(x)$ is any bounded continuous function, then

$$\int \phi(x)\, d\xi_i(x) \to \int \phi(x)\, d\xi_0(x). \tag{*}$$

The theorem of Krylov and Bogolyubov states that, given a sequence of proba-
bility measures on a compact metric space, there exists a probability measure
$\xi_0$ and a subsequence $\xi_i$ such that (*) holds for any bounded continuous function
of x. Actually it may be shown that Wald’s Theorem 2.15 and the theorem of 
Krylov and Bogolyubov are identical. 

It is possible to adopt a definition of convergence of decision functions to a 
limiting decision function ({94}, pp. 65-6) which will permit a single unified 
treatment of both absolutely continuous and discrete probability distributions 
({94}, Ths. 3.1, 3.2). The elegance of the book would be enhanced by such a 
treatment. 

The book marks the end of the first chapter of work on decision functions. 
It contains many of Wald’s results to date, and in some respects the most 
general results. However, papers {87} and {90} contain results not in the book. 
The papers {91}, {97}, {99}, and {101} date after the book and contain entirely 
new results. 

The results of {90} are basic for many purposes in decision theory and deserve 
some mention. They were obtained in January, 1948, in connection with work 
on the optimum character of the sequential probability ratio test {84}, but 
because of various delays were not published until much later (see also {85},
The paper {84} united two currents of Wald’s thought and I shall describe it 
below when I speak of his work on sequential analysis. The result of {84} was 
obtained by studying Bayes solutions of sequential decision problems involving 


two decisions. After this paper was completed it was natural to attack the 
same problem for k decisions. It turns out, as so frequently happens in mathe-
matics, that the proof of a fundamental convexity property, on which the
whole proof of {84} rests, and which was very lengthy for k = 2 ({84}, Lemma
2), became extremely simple for general k ({90}, Theorem 3.9). From this one 
obtains partial characterizations of certain regions of decision in the space of a 
priori (and a posteriori) distributions (ibid., Th. 3.10 and Section 4). Theorem 
3.7 gives a complete characterization of Bayes solutions. Theorem 3.8.1 gives 
a strong continuity result. Equation (3.10) describes a relation which governs 
the minimum Bayes risk. The contents of Chapter 4 of Wald’s book {94} re- 
capitulate part of this paper {90}. The methods of {84} and {90} are entirely 
in the spirit of Wald’s work and their results should be regarded as achievements 
of his theory. 

Wald in his book used decision functions which can be random at each stage, 
that is, when an observation is taken the decision made depends upon the 
outcome of a random experiment. Another method of randomization is to 
randomize once for all at the beginning of the sequence of observations and then 
to proceed in a nonrandom manner. In {99} the equivalence of the two methods 
under rather general conditions is shown. 

In {97} it is proved, inter alia, that when the number of decisions and possible 
distribution functions is finite, randomization can be eliminated if the distribu- 
tions are continuous. At least in this case the role of randomization is to break 
up “atoms” of probability. The proof rests on a general measure theoretic 
result proved in {96}, an extension of Lyapunov’s theorem. In {101} one can 
see clearly the intuitive basis of Wald’s theorem that the totality of Bayes solu- 
tions is complete. Characterizations of admissible solutions are given there. 
These papers are likely to interest chiefly the research worker. 

This discussion would not be complete without a brief statement of Wald’s 
attitude toward the minimax criterion. This attitude has been widely mis- 
understood. The question concerns a criterion for choosing a decision function 
from among those in the complete class. Wald often wondered how to give a 
criterion for choosing a member of the complete class in the absence of any 
information about which member of $\Omega$ is the true distribution. One possible
criterion seemed to him to call for the choice of an admissible minimax decision 
function. This has the advantages of being a very conservative procedure, of 
being independent of any a priori distribution on $\Omega$, and of having a constant risk
function (under certain conditions). However, it would be wrong to assert that 
Wald strongly advocated the minimax criterion. Thus in his book {94} he states 
on page 27: “Nevertheless, since Nature’s choice is unknown to the experimenter, 
it is perhaps not unreasonable for the experimenter to behave as if Nature 
wanted to maximize the risk.” However, even this qualified endorsement is
tempered by the next sentence: “But, even if one is not willing to take this
attitude, the theory of games remains of fundamental importance. ...” Wald
was searching for other criteria, and his last joint work with this writer con- 
cerned this problem. He was dissatisfied with known results on the problem 
and had no great faith in the necessity for the minimax criterion. 

The theory of statistical decision functions is the most brilliant part of Wald’s 
work. It is a landmark in statistical theory. Ours is the era of decision functions 
and its end is nowhere in sight. 

4. Wald’s sequential probability ratio test was a great statistical achievement. 
M. Friedman and W. A. Wallis posed to Wald the problem of performing se- 
quentially a test of a hypothesis. Wald’s first step was to take the simplest test 
of all, testing the hypothesis that the frequency function of independent observa-
tions is $f_1(x)$, against the alternative that the frequency function is $f_2(x)$, with
prescribed maximum probabilities of error. Wald’s immediate achievements were
twofold: (a) he conjectured that a test procedure based on constant limits for
$S_n$, where $S_n = \sum_{i=1}^{n} [\log f_1(x_i) - \log f_2(x_i)]$, would minimize the expected
number of observations under each of $f_1(x)$ and $f_2(x)$ (optimum property of the
sequential probability ratio test); (b) he obtained simple and excellent ap- 
proximations for these limits in terms of the prescribed probabilities of error. 
The first of these is, to my mind, a stroke of genius, and a rare and daring flight 
of intuition. It was proved only later ({84}) after a number of attempts, and by 
the methods of Wald’s decision theory and the manipulation of Bayes solutions. 
This was the paper Wald himself liked best. The excellence and simplicity of 
Wald’s approximations to the limits to be set on the cumulative sums are also 
typical of Wald. His intuition about approximations was uncanny and the 
crudest of methods in his hands struck gold. On many occasions in connection
with other problems I would protest that the methods he proposed were too 
crude to yield good approximations, only to find that this was not so. In this 
instance of sequential analysis the availability of excellent approximations was 
extremely important for future progress. Also very important was the fact that 
Wald had properly chosen to begin with the simplest problem, because it is 
there that the results are most startling. From there he went on to problems of 
composite hypotheses, where the results were not so definitive. 
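
The test procedure described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a reproduction of any code or notation from Wald's papers. The cumulative sum is oriented as in the text (log f1 minus log f2), so that large values favor f1. The limits are the approximations commonly attributed to Wald: with alpha the probability of rejecting f1 when it is true and beta the probability of rejecting f2 when it is true, the upper limit is log((1 - alpha)/beta) and the lower limit is log(alpha/(1 - beta)). The function names and the Bernoulli example are hypothetical.

```python
import math


def wald_limits(alpha, beta):
    """Approximate limits for S_n = sum(log f1(x_i) - log f2(x_i)).

    alpha: probability of deciding f2 when f1 is true (assumed convention)
    beta:  probability of deciding f1 when f2 is true (assumed convention)
    """
    lower = math.log(alpha / (1 - beta))   # S_n <= lower: decide f2
    upper = math.log((1 - alpha) / beta)   # S_n >= upper: decide f1
    return lower, upper


def sprt(observations, log_f1, log_f2, alpha, beta):
    """Run the sequential probability ratio test on a stream of observations.

    Returns (decision, n), where n is the number of observations used and
    decision is "f1", "f2", or "undecided" if the data run out first.
    """
    lower, upper = wald_limits(alpha, beta)
    s = 0.0
    n = 0
    for n, x in enumerate(observations, start=1):
        s += log_f1(x) - log_f2(x)         # update the cumulative sum S_n
        if s >= upper:
            return "f1", n
        if s <= lower:
            return "f2", n
    return "undecided", n


def bernoulli_log_pmf(p):
    """Log frequency function of a Bernoulli variable with P(x = 1) = p."""
    return lambda x: math.log(p if x == 1 else 1 - p)


# Hypothetical example: f1 has p = 0.5, f2 has p = 0.8.
log_f1 = bernoulli_log_pmf(0.5)
log_f2 = bernoulli_log_pmf(0.8)
```

With alpha = beta = 0.05, an unbroken run of ones leads to the decision f2 after seven observations, and an unbroken run of zeros leads to the decision f1 after four, since each zero is stronger evidence under these two hypothetical distributions than each one.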

Once fired with the idea, Wald labored incessantly at sequential analysis. 
For several months he did little else. Most of his own contributions to his book 
{76} (and these constitute most of the book itself) were discovered then. They 
were first published in the volume {55}, which was put in the “restricted”
category, and made available only to authorized recipients. Wald chafed greatly 
under this restriction. The first paper he managed to publish was {62}. Here he 
develops certain basic theorems governing sums of random variables, the number 
of random variables being itself a chance variable n. Among these results is the 
distribution of n. The connection between Wald’s sequential analysis and the 
random walk problem was being emphasized. This was followed by more results 
of this type in {70}. The goal always was to prove the optimum character of the 
sequential probability ratio test. In {70} Wald obtained a result which implied 
that if one neglects the excess of the cumulative sums over the prescribed limits
then the sequential probability ratio test minimizes the expected number of 
observations under each distribution. While this is still far from the desired 
mathematical result it points up the practical value of the sequential probability 
ratio test. Even if no more than this could be said about the latter the test would 
still be valuable in practice for most cases. The results of {70} are now largely
obsolete, partly through the result of {84} and partly through the work of 
others. 

No proof of the optimum property of the sequential probability ratio test 
different from {84} has yet been given. In my opinion, an essentially different 
proof would be an interesting and worthwhile achievement. It is likely that 
important things might result from a comparison of two essentially different 
ideas of proof. 

Claims have been made elsewhere, chiefly in England, to the invention of 
sequential analysis. We can clarify the matter simply as follows. The notion of 
taking observations sequentially was not Wald’s. It is probably an old idea, 
although a number of people lay claim to it, no doubt having discovered it 
independently. The brilliant and difficult part is the invention of the sequential 
probability ratio test. This is solely the work of Wald, and no trace of this idea 
exists in the literature prior to Wald’s work. 

When permission was granted to Wald to publish on sequential analysis, he 
described his work in {68} and gave an elementary exposition in {69}. The
contents of {68} do not differ much from the book {76}. The book is typical of
Wald. Clear, lucid, most of the researches on the subject his own, it was put
together hurriedly without too much thought of elegance or of reference to 
related fields. 

Papers {72} and {73} continued his work on sequential analysis, and their 
titles tell the story of their contents. Papers {75} and {82} evince the interest
in the theory of random walk to which he was drawn by his work on sequential
analysis. They are based on a paper by Erdős and Kac [3]. Here too the titles
tell the story. In {89}, written with his student Sobel, the problem considered
is that of deciding sequentially in which of three intervals the mean of a normal 
distribution with known variance lies. The solution is a simple, practical, and 
approximate one, and consists of an adaptation of his original solution of the 
problem of two hypotheses. 

In 1943, when he was wholly absorbed in sequential analysis and trying to 
prove the optimum property of the sequential probability ratio test, we often 
speculated on an intuitive explanation of this property, which we believed to 
be true. Wald would wonder whether one could similarly effect great economies 
in the process of estimation. He early decided that no saving could be effected 
in the case of estimating the mean (say $\theta$) of a normal distribution with known 
variance. His criterion was to consider all procedures for which the infimum 
(with respect to $\theta$) of the confidence coefficient was the same, and measure 
efficiency by the supremum (also with respect to $\theta$) of the expected number of 
observations. Several memoranda written for the Statistical Research Group 
contained a proof of the fact that no saving could be effected, under restrictions 
which Wald found obnoxious. In 1946 he succeeded in removing these restric- 
tions. The result was proved independently by his student C. M. Stein. Both 
proofs were complicated. A joint simplified proof was published in {77}. This 
was a very pretty and significant result. (Its proof has since been considerably 
simplified and the result itself extended in [4].) 

5. Wald’s remaining statistical work falls under many headings. 

A. Theory of games. While this work is perhaps not strictly statistical it was 
motivated in part by his theory of decision functions. He became interested in 
the theory of games on reading [1] and recognized its connection with his own 
work. He was the first to prove (in {67}) that, if one player of a zero-sum two- 
person game possesses only a finite number of pure strategies, the game is de- 
termined. This result has long since been generalized by Wald himself in {78}, 
where he needed a similar theorem for his theory of decision functions; condi- 
tional compactness with respect to an ‘intrinsic’ metric replaces finiteness. 
His most general result on the determinateness of a game is Theorem 2.23 of 
{94}; this result was obtained independently by Karlin [5]. 

B. Asymptotic results connected with the method of maximum likelihood and 
the likelihood ratio test. These include papers {43}, {46}, {48}, {57}, {81}, and 
{88}. The last of these contains the neatest and most expeditious of the rigorous 
proofs of the consistency of the maximum likelihood estimate yet available in 
the literature. 

C. Work on nonparametric inference. Nonparametric inference is that branch 
of statistics which deals with the case where the unknown distributions cannot 
be specified by the values of a finite number of parameters. His first paper on 
the subject was {34}. Here a completely unknown (except that its continuity 
is assumed) distribution function is estimated by a ‘belt’ which is a function 
of the observations. The estimation is made in the sense of Neyman [6]. The 
paper {34} was followed shortly by {40}. The title tells the story of the contents 
here; the test proposed was based on the number of runs. The notion of consist- 
ency of a test was introduced in a manner generalizing the notion of the con- 
sistency of a point estimate. 

In {50} and {52} Wald took up the notion of tolerance intervals (due to 
Wilks [7]). A tolerance interval is an interval which is a function of the observa- 
tions in one sample and which, with prescribed confidence coefficient, contains 
at least a specified fraction of another sample. By a very simple device using 
conditional distributions he extended Wilks’ univariate solution to the multi- 
variate case in {52}. Paper {50} is based on normal approximation theory. 

In {59} and {64} tests for a number of nonparametric problems are based on 
permutations of the observations. Limiting distributions are obtained. Much 
further remains to be done. (See also [13].) 

D. Miscellaneous statistical results. 

1. {38} gives confidence limits for the intraclass correlation coefficient in 
what is now called Model II of the analysis of variance. The method is forth- 
right and simple. This was one of Wald’s earliest statistical papers, and the 
problem was put to him by Hotelling. The result is generalized in {45}. 

2. {41} gives a consistent method of fitting a line when both variables are 
subject to error. (In ordinary least squares theory only one variable is subject 
to error.) For a recent result on this subject the reader is referred to [14] and [15]. 

3. {44}, written with his student Ralph Brookner, is one of his papers on 
multivariate analysis. It gives the distribution, under certain conditions, of the 
statistic derived by Wilks by use of the likelihood ratio to test the independence 
of groups of jointly normally distributed chance variables. The statistic is the 
quotient of products of determinants of sample correlation coefficients. 

4. {49}, written with H. B. Mann, considers optimizing the procedure of 
testing goodness of fit by means of the $\chi^2$ test. A metric is introduced into the 
totality of alternatives. The authors decide to maximize the smallest value of 
the power of the test for alternatives which are neither too far away from the 
null hypothesis (such alternatives are easy to detect) nor too near the null 
hypothesis (such alternatives it is almost hopeless to detect). They conclude 
that the class intervals should all have the same probability under the null 
hypothesis and when the sample size N is large should be of the order $N^{2/5}$ in 
number. (A precise result is described.) Although this is a problem of great 
importance little new has been added to their result. 

5. {51} generalizes a result of P. L. Hsu [8] on the power function of the 
analysis of variance. This work has since been continued by others (Hunt and 
Stein, Lehmann, and the writer). The proof of {51} was shortened in [9]. 

6. {54} tries to give a measure for the efficiency of a design for testing a linear 
hypothesis. This is a pioneer effort. No further work on this important problem 
has to my knowledge been done. 

7. {60} treats the problem of classifying an individual into one of two groups. Each 
group has the same p characteristics jointly normally distributed with a com- 
mon covariance matrix, but with differing means. The statistic actually em- 
ployed is a slight modification of the one given by the probability ratio test. 
The problem is one of finding a distribution, and is not solved completely. 
(Wald had to be urged to publish this paper.) The problem is one of great prac- 
tical importance and work is being done on it at present (see [12]). 

8. {65} was a venture of Wald’s into statistical quality control. It was stimu- 
lated by a paper by Dodge [10]. It contains, inter alia, a sampling inspection 
scheme which guarantees a prescribed lower bound on the outgoing quality, 
and, in the case of statistical control, requires a minimum of inspection. 

9. {83} contains some results on a problem first raised by Neyman and Scott 
[11], that of estimating a fixed number of (structural) parameters when each 
successive observation depends also upon a new (incidental) parameter. This 
is a problem of great importance which is still far from being solved. 

The above list of papers is far from exhaustive. It illustrates the multitude of 
problems attacked by Wald and his amazing fertility. Many of the problems 
are of Wald’s own formulation. His amazing gift for the “practical” theory, 
coupled with his mathematical powers, is to be seen everywhere in these papers. 

6. Wald himself was a man rather completely immersed in his work. His 
other chief interest was his family. He had played the violin before coming to 
the United States, but under pressure of work had never resumed playing. 
Hiking was his chief diversion; he was an indefatigable walker and some of his 
joint papers were worked out on long hikes. He was helpful and kind to his 
students, and more than generous in sharing credit with others. To the end he 
was modest and unassuming, with an unusual aversion to all forms of contro- 
versy. 

In 1941 he married Lucille Lang, who met her untimely end with him. Two 
children survive—Betty, born in 1943, and Robert, born in 1947. He was much 
absorbed in his family life and very devoted to the children. 

It is difficult to see the limits of Wald’s influence on our young science of 
mathematical statistics. How tragic it is that we have lost him in his prime, 
when there is so much yet to do, and when there are still so few who can follow 
in the trails he blazed. 

THE FORMATIVE YEARS OF ABRAHAM WALD AND HIS WORK 
IN GEOMETRY 


By KARL MENGER 
Illinois Institute of Technology 


In the fall of 1927, a man of 25 called at the Mathematical Institute of the 
University of Vienna. Since he expressed a predilection for geometry he was 
referred to me. He introduced himself as Abraham Wald. In fluent German, 
but with an unmistakable Hungarian accent, Wald explained that he had car- 
ried on most of his studies at the elementary and secondary school levels at 
home, mainly under the direction of his older brother Martin, a capable elec- 
trical engineer in Cluj (Kolozsvar, Klausenburg). He had just arrived in Vienna 
in order to study mathematics at the university. Geometry had interested him 
ever since he was fourteen. More recently he had been reading Hilbert’s “Grund- 
lagen der Geometrie” (Foundations of Geometry) and he saw possibilities for 
improving these foundations by omitting some postulates and weakening others. 
I suggested to Wald that he write up his results {5}¹ (one of his proofs was later 
incorporated into the seventh edition of Hilbert’s book) and at the same time 
recommended some additional reading. 

Wald enrolled in the university, but during the next two years Vienna did 
not see much of him. The system of complete freedom which at that time pre- 
vailed in the universities of Central Europe—a detrimental system for weak 
students—kept the gifted ones from wasting semesters on courses the content 
of which they could absorb in a few weeks of concentrated reading. Moreover, 
Wald had to serve in the Rumanian army. 

It was not until February 1930 that he and I again had extended conversa- 
tions. Then he came unexpectedly to hand me a manuscript which purported 
to contain the solution of a famous problem. It was a serious piece of work, 
but an error at the very end invalidated the result. Wald was visibly disap- 
pointed. But a few days later he returned to tell me that, during the last week, 
he had been sitting in on my lectures on metric geometry (the first university 
lectures he ever attended) and that he planned to follow this entire course. 
Moreover, he wanted to try his hand at some problem in this field. I had just 
introduced the “between” relation in metric spaces: The point q is between 
the points p and r if, and only if, $p \neq q \neq r$ and the three distances between 
the points satisfy the equality 


$$d(p, q) + d(q, r) = d(p, r).$$ 


I asked Wald whether he would like to try to characterize this “betweenness” 
among the ternary relations in a metric space. Four weeks later he brought me 
the first draft of the solution which he subsequently published in the Mathe- 
matische Annalen {1}, {2}, {3}, {7}. At the same time he asked for another 
problem. 

¹ References are listed in “The publications of Abraham Wald,” pp. 29-33 of this issue. 

It seemed to me that Wald had exactly the spirit which prevailed among the 
young mathematicians who gathered together about every other week in what 
we called our Mathematical Colloquium; so I at once invited him to present 
his result there. Gödel and Nöbeling, Alt and Beer were among the regular 
participants in this Colloquium; Miss Taussky came whenever she was in 
Vienna; Čech, Knaster, and Tarski were frequent guests; and numerous stu- 
dents and visitors came from abroad, especially from the United States and 
Japan. It was in this stimulating atmosphere that Wald spent his formative 
years. In these colloquia he became familiar with important problems, and pre- 
sented the remarkable solutions which he published in the Ergebnisse eines 
Mathematischen Kolloquiums. 

Wald’s second course at the university was on dimension theory. I had sug- 
gested that topology might be developed in spaces other than point sets. Instead 
of “points,” “pieces” might be the undefined basic concept. Certain nested 
sequences of pieces might be called points. Wald succeeded in characterizing 
the nested sequences which should be so named {4}. 

After this excursion into topology, Wald returned to metric geometry. In 
1928, I had characterized the metric spaces congruent to subsets of the n-di- 
mensional euclidean space or of the Hilbert space. Wald solved the correspond- 
ing problem for the n-dimensional complex space (in which each point is given 
by n complex coordinates) as well as for all indefinite spaces where the coordinates 
of the points are real, but the square of the distance between two points is given 
by an indefinite quadratic form rather than by the definite sum of squares which 
goes back to the law of Pythagoras {14}. An unpublished manuscript, “On ab- 
stract fields and metrics,” has been found. The paper is a continuation of the note 
of Miss Taussky in Issue 6 (pp. 20-23) of the Ergebnisse. 

These studies aroused in Wald an interest in determinants {8} and led him 
to the following discovery. Let S be a four-dimensional simplex. It has 10 sides 
and 10 triangular faces. Geometers had known for a long time that the volume 
of S is determined by the lengths of its 10 sides. Is this volume also determined 
by the areas of the 10 faces? Wald constructed two simplexes with equal faces 
but different volumes {9}. 
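
The classical fact mentioned above, that the side lengths determine the volume, is usually made explicit through the Cayley-Menger determinant. The text does not name a formula, so the sketch below is only one standard way to compute the volume; the function names `determinant` and `simplex_volume` are illustrative choices and are not taken from Wald's paper {9}.

```python
import math

def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        # Pick the row with the largest entry in this column as pivot.
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det

def simplex_volume(dist):
    """Volume of an n-simplex from its (n+1) x (n+1) matrix of side lengths.

    Uses V^2 = (-1)^(n+1) * det(CM) / (2^n * (n!)^2), where CM is the
    matrix of squared distances bordered by a row and column of ones.
    """
    n = len(dist) - 1
    size = n + 2
    cm = [[0.0] * size for _ in range(size)]
    for i in range(1, size):
        cm[0][i] = cm[i][0] = 1.0
        for j in range(1, size):
            cm[i][j] = float(dist[i - 1][j - 1]) ** 2
    vol_sq = (-1) ** (n + 1) * determinant(cm) / (2 ** n * math.factorial(n) ** 2)
    # Clamp tiny negative values caused by rounding in degenerate cases.
    return math.sqrt(max(vol_sq, 0.0))

# Regular four-dimensional simplex: 5 vertices, all 10 sides of length 1.
regular = [[0.0 if i == j else 1.0 for j in range(5)] for i in range(5)]
print(simplex_volume(regular))  # equals sqrt(5)/96 up to rounding
```

No analogous formula in terms of the 10 face areas can exist, since Wald's two simplexes share their face areas but not their volumes.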

At that time Wald also became interested in Steinitz’s theorem on the sums 
of series of vectors—a generalization of Riemann’s famous result that any not 
absolutely converging series of real numbers can, by a permutation of its terms, 
be made to converge toward any number. Steinitz’s theorem states that the 
vectors of an n-dimensional space, toward which a series of vectors can (by a 
permutation of the terms) be made to converge, form a linear manifold. Wald 
gave a new proof of the theorem and extended it to spaces of infinitely many 
dimensions. Moreover he studied series of group elements {11}, {12}, {13}. 


In order to enhance the analogy between the postulates for Lebesgue measure 
and the postulates I had formulated for dimension, Wald developed a charac- 
terization of L-measure among set functions in which he confined the additivity 
postulates to closed sets {17}. Let $\mu(S)$ be a set function which (as, e.g., Lebes- 
gue’s exterior measure) is defined for every subset of the euclidean n-space and 
satisfies the following conditions: (1) $\mu(S) \geq 0$ for every $S$; (2) $S' \subset S$ implies 
$\mu(S') \leq \mu(S)$; (3) $\mu(C_1) + \mu(C_2) = \mu(C_1 + C_2)$ for any two disjoint closed sets; 
(4) $\mu\left(\sum_{i=1}^{\infty} C_i\right) \leq \sum_{i=1}^{\infty} \mu(C_i)$ for every sequence of closed sets; (5) $\mu(I) = 1$ for 
any unit n-cube $I$. Then, for every L-measurable set, $\mu(S)$ is equal to the L- 
measure of $S$. 

In later years, he and I often joked about the fact that he took only one more 
course at the university before getting his Ph.D. This third course dealt with 
the new development of projective and affine geometry based on the operations 
of joining and intersecting which Garrett Birkhoff several years later called 
lattice operations. As a result of his studies in this field, Wald took active part 
{6} in the discussions of this subject which G. Bergmann, Alt, Schreiber, and 
myself carried on in the colloquium. 

Another favorite topic of discussion there was the idea of curvature. On this 
subject Wald did his masterpiece in the field of pure mathematics. By virtue of 
the triangle inequality 


$$d(p, q) + d(q, r) \geq d(p, r),$$ 


three points p, q, r of a metric space are always congruent to three points of 
the euclidean plane. Consider the circum-circle of these latter points and call 
the reciprocal of its radius the curvature of the points p, q, r. This curvature is 
zero if and only if one of the three points is between the other two. Now let A 
be an arc contained in a metric space. Its points need not be given by coordi- 
nates, and its shape is not necessarily described by equations or functions. All 
that is assumed is an ordered continuum with a distance defined for every pair 
of points. For this general situation, I defined the curvature of A at the point 
a as the number (if it exists) from which the curvature of any three points dif- 
fers arbitrarily little, provided all three points are sufficiently close to a. Nu- 
merous theorems were proved about this general curvature of curves, and about 
modifications of this concept due to Alt and Gödel. But the main problem was, 
of course, the extension of the idea to higher-dimensional manifolds. 
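
The curvature of three points described above depends only on their three mutual distances, which a short computation makes concrete. The following is a minimal sketch, assuming the standard circumradius formula R = abc/(4A) with the area A taken from Heron's formula; the function names `three_point_curvature` and `is_between` are illustrative and do not come from the original papers.

```python
import math

def three_point_curvature(d_pq, d_qr, d_pr, tol=1e-12):
    """Reciprocal of the circumradius of a euclidean triangle with these sides.

    The three points of the metric space enter only through their
    distances, exactly as in the definition in the text.
    """
    a, b, c = float(d_pq), float(d_qr), float(d_pr)
    if min(a, b, c) <= 0.0:
        raise ValueError("the three points must be distinct")
    if max(a, b, c) > (a + b + c) - max(a, b, c) + tol:
        raise ValueError("distances violate the triangle inequality")
    s = (a + b + c) / 2.0
    # Heron's formula; clamp at zero to absorb rounding when the
    # triangle is degenerate (one point between the other two).
    area_sq = max(s * (s - a) * (s - b) * (s - c), 0.0)
    return 4.0 * math.sqrt(area_sq) / (a * b * c)

def is_between(d_pq, d_qr, d_pr, tol=1e-12):
    """True if q is between p and r, i.e. d(p,q) + d(q,r) = d(p,r)."""
    return abs(d_pq + d_qr - d_pr) <= tol

# A 3-4-5 right triangle has circumradius 2.5, hence curvature 0.4.
print(three_point_curvature(3, 4, 5))
# If q is between p and r the triangle collapses and the curvature is 0.
print(is_between(1, 2, 3), three_point_curvature(1, 2, 3))
```

The curvature of an arc at a point a is then the limit of this quantity as all three points approach a, which the sketch does not attempt to compute.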

From the outset it had been clear that, on a surface, quadruples of points 
should be considered. But what number should be associated with a given 
quadruple of points of a metric space? Four congruent points in the euclidean 
space do not necessarily exist; and even if they do exist, the radius of their 
circum-sphere is of no particular significance. Wald considered spheres metrized 
by the lengths of the arcs of great circles. For positive k, let $S_k$ denote the sphere 
of curvature k thus metrized; $S_0$ is the euclidean plane; for negative k, let $S_k$ be 
the hyperbolic plane of curvature k. If four points of a metric space are given, 
what $S_k$ contains four congruent points? The difficulty of the problem is illus- 
trated by the fact that for some quadruples of points no such $S_k$ exists, whereas 
for some other quadruples there is more than one such $S_k$. 

Wald overcame these and other difficulties. He proved {18}, {19}, {22} that 
if S is a surface of the type studied in classical differential geometry, then for 
each point a there exists a number $\kappa(a)$ with the following property: in S, for 
every quadruple of points each of which is sufficiently close to a, there exists a 
congruent quadruple in an $S_k$ of which the curvature k differs arbitrarily little 
from $\kappa(a)$. Moreover, he proved that $\kappa(a)$ is equal to the famous Gauss curvature 
of S at the point a. Thus he obtained a new and very natural way of introducing 
Gauss’s curvature. Even if he had stopped at this point, his result would have 
been a remarkable achievement. 

But here was the beginning of Wald’s really great work. In the second part 
of his paper {22} he dropped the assumption that a surface of the type studied 
in classical differential geometry be given. He dispensed with the characteriza- 
tion of points by coordinates and of surfaces by equations or functions or para- 
metrizations. Continuing the idea which I had used in the simpler case of arcs, 
he merely assumed a compact metric space S with the following properties: 
(1) S is what I had called convex; that is, for any two points p and r of S, there 
exists a point between p and r; (2) S, at every point a, has a curvature $\kappa(a)$ (the 
symbol used in Wald’s sense, for Gauss’s definition of curvature is obviously 
inapplicable in this general situation). The second property means that for any 
four points of S which are sufficiently close to a, there exist four congruent 
points on a sphere $S_k$ where k differs arbitrarily little from $\kappa(a)$. From this 
simple assumption Wald deduced (1) that S is a surface; (2) that in this surface 
polar coordinates can be locally introduced; (3) that in terms of these coordi- 
nates the length is expressed as it is on the classical surface of differential geom- 
etry; (4) that, for each point a of S, the number $\kappa(a)$ is equal to the Gauss cur- 
vature at a of the classical surface created on S by the introduction of the polar 
coordinates. 

I venture to predict that the theorem just stated will become a cornerstone 
in the geometry of the future. This development may not please the devotees of 
classical differential geometry, for the theorem reveals serious redundancies in 
their assumptions. The essential features traditionally postulated (that is, coor- 
dinates which characterize points, parametric representations of surfaces, and 
of course, the differentiability of functions) can be derived. In fact, they can be 
derived from the one simple assumption of a convex compact metric space 
which at every point admits a Wald curvature. This result should make geom- 
eters realize that (contrary to the traditional view) the fundamental notion of 
curvature does not depend on coordinates, equations, parametrizations, or dif- 
ferentiability assumptions. The essence of curvature lies in the general notion 
of a convex metric space and of a quadruple of points in such a space. Some day 
these simple notions will be recognized as an adequate foundation for those 
local geometric properties the study of which for the last 250 years has been 
monopolized by differential geometers with their complicated conceptual ma- 
chinery. 

At this point I must interrupt the story of Wald’s work and insert a few re- 
marks about his life. He received his Ph.D. in 1931. At that time of economic 
and incipient political unrest, it was out of the question to secure for him a 
position at the University of Vienna, although such a connection would cer- 
tainly have been as profitable for that institution as for himself. Outside of the 
Colloquium, my friend Hahn was the only mathematician who knew Wald 
personally. No one else showed the slightest interest in his work. However, 
Wald, with his characteristic modesty, told me that he would be perfectly satis- 
fied with any small private position which would enable him to continue his 
work in our Mathematical Colloquium. I remembered that my friend Karl 
Schlesinger, a well-to-do banker and economist, wished to broaden his knowledge 
of higher mathematics; so I recommended Wald to him. 

Out of the association between these two men grew Wald’s interest in the 
equations of economic production. I asked Schlesinger to present his formulation 
of the equations to the Colloquium. Subsequently Wald published papers 
{15}, {21} on these ideas in the Ergebnisse, the first publications in his long list of 
contributions to mathematical economics. They have become classics in the field. 
Here, for the first time, economic equations were not merely formulated. The 
number of equations was not merely compared with the number of unknowns. 
The equations were solved. It was Schlesinger’s modification of the original 
equations of Walras and Cassel which made them soluble. Soon after, I recom- 
mended Wald to Oskar Morgenstern, then director of the Austrian Institute 
for Business Cycle Research (Konjunkturforschung), and Morgenstern gave 
him employment in the Institute. 

At that time there occurred a second event which proved to be of crucial 
importance in Wald’s further life and work. The Viennese philosopher Karl 
Popper, now professor at the London School of Economics, tried to make pre- 
cise the idea of a random sequence, and thus to remedy the obvious shortcomings 
of von Mises’ definition of collectives. After I had heard (in Schlick’s Philo- 
sophical Circle) a semitechnical exposition of Popper’s ideas, I asked him to 
present the important subject in all details to the Mathematical Colloquium. 
Wald became greatly interested and the result was his masterly paper on the 
self-consistency of the notion of collectives {29} in the Ergebnisse. He based 
his existence proof for collectives on a twofold relativisation of that notion. 

Let $M$ be a (finite or infinite) set of symbols, such as H(ead) and T(ail), or 
1, 2, 3, 4, 5, 6, or the points inside of a given square of the plane. By a selection 
of nth order Wald means a function $f_n$ associating with every ordered n-tuple 
of elements $m_1, \dots, m_n$ of $M$ a value $f_n(m_1, \dots, m_n)$ which is either 0 or 
1; by a selector (Auswahlvorschrift), a sequence $S = \{f_0, f_1, \dots, f_n, \dots\}$. 
A selector makes it possible to select from every sequence of elements 
$\{m_1, m_2, \dots, m_n, \dots\}$ of $M$ a subsequence 

$$S\{m_1, m_2, \dots, m_n, \dots\} = \{m_{i_1}, m_{i_2}, \dots, m_{i_n}, \dots\}$$ 

by the following procedure: 

$m_{i_1}$ is the first element such that $f_{i_1 - 1}(m_1, \dots, m_{i_1 - 1}) = 1$, 

$m_{i_2}$ is the first element after $m_{i_1}$ such that $f_{i_2 - 1}(m_1, \dots, m_{i_2 - 1}) = 1$, 
and so on. Clearly, the selected subsequence may be infinite, finite or even 
vacuous. 
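
The selection procedure just described can be sketched in a few lines. This is only an illustration under stated assumptions: a selector is modeled as a single Python function that receives the tuple of elements seen so far (so it stands in for the whole family of selections of every order, the empty tuple corresponding to the selection of order zero), and the names `select`, `identity` and `after_head` are hypothetical.

```python
def select(selector, sequence):
    """Yield the subsequence that `selector` picks out of `sequence`.

    `selector(prefix)` receives the tuple of the n elements already seen
    and returns 0 or 1. The next element is kept exactly when the value
    is 1, so the decision about an element never depends on that element
    itself, only on its predecessors.
    """
    prefix = []
    for m in sequence:
        if selector(tuple(prefix)) == 1:
            yield m
        prefix.append(m)

def identity(prefix):
    """The identity selector: every element is selected."""
    return 1

def after_head(prefix):
    """Select an element only if the element just before it was 'H'."""
    return 1 if prefix and prefix[-1] == "H" else 0

tosses = list("HTHHTTHT")
print(list(select(identity, tosses)))    # the sequence itself
print(list(select(after_head, tosses)))  # ['T', 'H', 'T', 'T']
```

Because `select` is a generator, it also accepts unbounded iterators, which matches the remark that the selected subsequence may be infinite, finite, or empty.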

Now let there be given (1) a set $\mathfrak{S}$ of selectors (including the identity which 
associates every sequence with itself); (2) a set $\mathfrak{M}$ of subsets of $M$. Then Wald 
calls a sequence $\{m_1, \dots, m_n, \dots\}$ an $(\mathfrak{S}, \mathfrak{M})$-collective if for every set $M^*$ 
belonging to $\mathfrak{M}$ there exists an $\mathfrak{S}$-probability $P(M^*, \mathfrak{S})$ in the following sense: 
every selector $S$ belonging to $\mathfrak{S}$ selects from $\{m_1, \dots, m_n, \dots\}$ a subse- 
quence $S\{m_1, \dots, m_n, \dots\} = \{m_{i_1}, \dots, m_{i_n}, \dots\}$ such that the relative 
frequency of $M^*$ among the initial segments of the subsequence [that is, the 
number of elements of $M^*$ among $\{m_{i_1}, \dots, m_{i_n}\}$ divided by n] converges to 
$P(M^*, \mathfrak{S})$ as $n \to \infty$. 

The two parameters $\mathfrak{S}$ and $\mathfrak{M}$ must be given if collectives are to be discussed 
in a self-consistent way. Although Wald’s relativisation restricts the original 
unlimited (but unworkable) idea of collectives, it is much weaker than the 
irregularity requirements of Copeland, Popper, and Reichenbach. In fact, it 
embraces these requirements as special cases. 

It was through this work on collectives and a study of time series {24} under- 
taken at Morgenstern’s suggestion that Wald became interested in the founda- 
tions of statistics. But he kept on working at geometric problems, and added 
interesting remarks {25}, {26} to my first applications of metric methods to 
the calculus of variations. 

Meanwhile the political situation in Austria deteriorated from month to 
month. The Ergebnisse was criticized (with specific reference to Wald) for its 
large number of Jewish contributions just when I felt that we ought to honor 
that journal by making Wald co-editor. Issue 7 was edited by Gédel, Wald, 
and myself. But Issue 8 containing Wald’s paper on collectives was destined to 
be the last of the series. Hahn was dead. Schlick had been assassinated. Viennese 
culture resembled a bed of delicate flowers to which its owner refused soil and 
light while a fiendish neighbor was waiting for a chance to ruin the entire gar- 
den. I left the country. A year later Hitler marched into Vienna. Schlesinger, 
who occupied a rather prominent position, chose death that same day. These 
events foreshadowed the fate which later overtook Wald’s family to which he 
was deeply attached. His parents and his sisters were murdered in the gas 
chambers of Ossoviec (Auschwitz); his brother Martin, the engineer, perished 
as a slave laborer in Western Germany. 

Wald himself continued for a few weeks after Hitler’s arrival in Vienna. He 
was dismissed by Morgenstern’s successor but not otherwise molested. But I 
was greatly worried about his future as long as he remained in Austria, and with 
other friends, I tried to get him to the United States. Thanks to his work in 
econometrics and statistics he was permitted to come. Economists and statisti- 
cians soon became aware of his potentialities, and from the outset he was grati- 
fied to feel that this country would make effective use of his talents and abilities. 

When he ceased working in the field of geometry, it was not for lack of in- 
terest. It was for lack of time. Whenever he and I met during the summer (we 
usually spent our vacations together in the mountains) we discussed both geom- 
etry and statistics. Wald’s last geometric papers, {53}, {63}, date from 1943. By 
a strange coincidence, they deal with the “between” relation to which his first 
publication was devoted—but on a different level. In 1942 I had introduced a 
statistical metric in which the distance between two points is a distribution 
function rather than a number. Wald improved my original triangle inequality, 
upheld the definition of betweenness by a triangle equality, and proved that, 
even on the statistical level, betweenness has the properties by which, in 1930, 
he had characterized it among the ternary relations in metric spaces. 

I have often wondered what would have happened if Wald had continued 
his geometrical work. A safe conjecture is that, if he had returned to geometry, 
that subject would have been greatly benefited. He and I had planned to work 
out a new differential geometry and vector analysis. If we had succeeded, a 
metric theory of the curvature of higher-dimensional spaces would now be in 
existence. 


Another probable conjecture is that his geometric work would not have found 
the acclaim accorded to his work in applied mathematics. Geometry is not 
fashionable today. Although it is bound to outlive some current ephemeral 
fashions, even Wald’s powerful talent would probably not have turned the tide. 
He might have remained a great but relatively unknown geometer. 

Be that as it may, what Wald actually accomplished in geometry is of the 
first importance. I realize the high value of his papers on econometrics and of 
his book on sequential analysis, and I am aware of the profound influence which 
his theory of decision functions is bound to exert for decades to come. But 
nevertheless I believe that anyone who really understands his theory of the 
curvature of surfaces will find that this work is second to none of Wald’s other 
achievements. 





ABRAHAM WALD’S CONTRIBUTIONS TO ECONOMETRICS¹ 
By G. TINTNER 
Iowa State College 


The untimely death of Professor Wald has deprived econometrics of a man 
who has made many important contributions to mathematical economics and 
statistics. For an account of Wald’s life and work we refer to the moving and 
brilliant essay by Morgenstern [20].² Many of Wald’s numerous publications 
are of direct or indirect interest to econometricians. They may be conveniently 
classified into the following groups: 

(1) The existence of meaningful solutions for systems of equations in mathe- 
matical economics. 

(2) Indifference systems and cost of living index numbers. 

(3) The minimax principle and its relationship to nonstatic economics. 

(4) Elimination of the seasonal variation. 

(5) The problem of identification. 

(6) Stochastic difference equations. 

There are also other contributions to mathematical statistics which may be 
of interest to econometrics. Among these we mention only: his criticism of the 
variate difference method {24}³ (cf. Tintner, [24], pp. 14 ff.); his method of 
deriving linear relationships between variables which are subject to error {41} 
(cf. Bartlett, [5]); and his nonparametric test for autocorrelations {59}. 

1. Meaningful solutions of systems of equations in mathematical economics. 
Wald deals first with the equations of the theory of production {15}, {21}, 
{23} (cf. Stigler, [23], pp. 243 ff.). Each product is produced by a combination 
of various factors of production. There is no saving. The coefficients of produc- 
tion are constants. The commodities may be free goods, if there is a surplus. 
The theory yields an answer to the question: which of them will ultimately be 
free goods? There are demand functions for all the products which depend upon 
the prices. 

Under which conditions has this system a unique and nonnegative solution 
in the quantities of products, the prices of the products, the prices of the factors 
of production and the surpluses? 

The conditions under which this is the case are formulated by Wald as fol- 
lows: (1) a positive amount of each factor of production is available; (2) the 
amounts of the factors of production which are used for the production of the 
products are not negative; (3) there is at least one factor of production which 
is necessary for the production of each product; (4) the relationship between 
the prices and the quantities demanded of the products is continuous; (5) the
demand for each product is zero only for infinite price. Condition (6) is imposed
upon the demand functions for the products. It is derived from a proposition
in value theory: the marginal utility of each commodity is much more influenced
by changes in the quantity of this commodity than of any other commodity.
(7) The rank of the matrix of the constant coefficients of production is equal to 
the number of the factors of production used. 

The second case considered by Wald is the equations of exchange in their 
Walrasian form {21}, {23}. They are static equations. We assume free com- 
petition. Each individual maximizes his utility. He is restricted by budget 
equations and the markets are all cleared (total demand equals total supply). 
Under which conditions will the equations of exchange have a unique solution 
where the prices are positive and the final quantities after the exchange are also 
positive? 

The conditions are: (1) all initial quantities possessed are nonnegative; (2) 
each individual starts with a positive amount of at least one good; (3) for each 
good there is at least one individual who starts with a positive amount of this 
commodity. The conditions (4) and (5) restrict the form of the utility functions 
somewhat. They are identical with the Walrasian assumption that the marginal 
utility of each commodity is independent of all other commodities and decreases 
with increasing amounts of this good. The elasticity of the marginal utility of 
each good is infinite for the amount zero of this commodity and less than one 
throughout. 

The third case treated deals with the Cournot case of duopoly {23}. Costs
are neglected. Each duopolist considers the supply of his competitor as given
and adjusts his own supply in such a fashion as to maximize his profit.

The results are as follows. (1) Assume that the demand function for the com- 
modity in question cuts the quantity and price axis and has a continuous first 
derivative. Then there is at least one equilibrium point which is reached from 
any initial situation. (2) The demand function cuts the quantity and the price 
axis. It has a first and a second derivative. The first derivative is negative, the
second not positive. Then the reaction function is unique, continuous and de-
creases monotonically. There is exactly one equilibrium point which is stable.

The great importance of these investigations for mathematical economics 
should not be underestimated. It would be most important to obtain analogous 
results for other cases of economic systems, for example, various market organi- 
zations. 

The mathematical methods used in the proofs are essentially topological and 
related to the theory of sets. The command of advanced methods is unmatched 
except for articles by Menger [18], [1] and von Neumann [28], [29]. It is quite 
comparable to the beautiful theory of von Neumann and Morgenstern [30]. 
Wald has also contributed an important review of this theory {74} and used it 
in connection with his own ideas on statistical inference ({94}, pp. 24 ff.). 

It should also be mentioned that methods similar to those used by Wald in 
this field have recently been employed by a number of authors in the discussion
of activity analysis of production and allocation (Koopmans, [13]). This im-
portant field, which is related to Leontief’s earlier verification of the tableau
économique [15], is of supreme importance for economic theory and policy.

2. Indifference systems and the cost of living index. Three related investiga- 
tions of Wald deal with this problem. Following the work of Konüs [12],
Haberler [9], Leontief [14], and Staehle [22], he improved the limits of cost of 
living index numbers {28}. 

Assume that there are two situations, called one and two. The market prices 
of all commodities consumed and the quantities consumed are known. The 
indifference system is the same in the two situations. What are the limits of 
the cost of living index numbers which correspond to situation one or situation 
two? 

Wald derives all limits which hold for all possible cases. He also shows that 
these limits cannot be improved, if nothing else is known. 

He succeeded in narrowing down the limits if the following condition holds. 
Consider two situations which lie on the same indifference curve. It is assumed 
that not all commodities have a larger or all commodities a smaller marginal 
utility in one situation than in the other. 

This theory gives us limits for the cost of living index numbers. If we make 
the assumption that the utility function can be approximated by a polynomial 
of the second degree in a small region we can actually obtain an approximation 
to the true cost of living index, as Wald has shown {28}, {36}. His work is 
related to earlier efforts by Bowley and Frisch (Frisch, [6]). 

The above assumption implies the linearity of the Engel curves. The Engel
curves show the dependence of the consumption of each commodity upon
income, prices being constant. Let $E$ be the expenditure (or income) and $q_i^1$ the
quantities of good $i$ consumed in situation one. Then the Engel curves in
situation one are:

$q_i^1 = a_i^1 E + b_i^1,$

where $a_i^1$ and $b_i^1$ are constants which can be determined by a statistical study of
family budgets (see, e.g., Allen and Bowley, [1]).
Let $q_i^2$ be the consumption of commodity $i$ in situation two. Then the Engel
curves in situation two are:

$q_i^2 = a_i^2 E + b_i^2.$

The approximation to the true cost of living index number is then:

$\sqrt{\dfrac{\sum a_i^1 p_i^2}{\sum a_i^2 p_i^1}} + \dfrac{\sum b_i^1 p_i^2 - \left(\sum b_i^2 p_i^1\right) \sqrt{\sum a_i^1 p_i^2 \big/ \sum a_i^2 p_i^1}}{E_1 \left(1 + \sqrt{\sum a_i^1 p_i^2 \cdot \sum a_i^2 p_i^1}\right)},$

where $p_i^1$ and $p_i^2$ are the prices of commodity $i$ in periods one and two, and $E_1$ is the
total expenditure in situation one. The summations are extended over all
commodities.

Wald shows that the index number of the double expenditure method (Frisch,
[6]) is a special case of his own cost of living index number, under certain condi- 
tions. If the Engel curves are straight lines through the origin then both for- 
mulae are equivalent and lead to the celebrated ideal index number of Fisher. 
This is the geometric mean of Laspeyres’ and Paasche’s index numbers. 
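
The reduction to Fisher's ideal index can be checked numerically. The following is a minimal sketch, not Wald's own computation: it assumes the index takes the form $\lambda + (\sum b_i^1 p_i^2 - \lambda \sum b_i^2 p_i^1) / [E_1 (1 + \sqrt{\sum a_i^1 p_i^2 \cdot \sum a_i^2 p_i^1})]$ with $\lambda = \sqrt{\sum a_i^1 p_i^2 / \sum a_i^2 p_i^1}$, and the function names and the two-commodity data are hypothetical.

```python
import math


def dot(x, y):
    """Sum of products over all commodities."""
    return sum(u * v for u, v in zip(x, y))


def wald_index(a1, b1, a2, b2, p1, p2, e1):
    """Approximate cost of living index from linear Engel curves.

    a1, b1: slopes and intercepts of the Engel curves in situation one.
    a2, b2: the same for situation two.
    p1, p2: prices in the two situations; e1: expenditure in situation one.
    """
    a1p2 = dot(a1, p2)
    a2p1 = dot(a2, p1)
    lam = math.sqrt(a1p2 / a2p1)
    correction = (dot(b1, p2) - dot(b2, p1) * lam) / (
        e1 * (1.0 + math.sqrt(a1p2 * a2p1))
    )
    return lam + correction


def fisher_index(q1, q2, p1, p2):
    """Geometric mean of the Laspeyres and Paasche index numbers."""
    laspeyres = dot(q1, p2) / dot(q1, p1)
    paasche = dot(q2, p2) / dot(q2, p1)
    return math.sqrt(laspeyres * paasche)


# Hypothetical data: Engel curves through the origin (all intercepts zero).
# The slopes satisfy the budget identity sum(a * p) = 1 in each situation.
p1, p2 = [1.0, 2.0], [2.0, 3.0]
a1, a2 = [0.5, 0.25], [0.2, 0.2]
zero = [0.0, 0.0]
e1, e2 = 100.0, 170.0
q1 = [a * e1 for a in a1]
q2 = [a * e2 for a in a2]

wald = wald_index(a1, zero, a2, zero, p1, p2, e1)
fisher = fisher_index(q1, q2, p1, p2)
```

With zero intercepts the correction term vanishes and `wald` agrees with `fisher`, whatever value is chosen for `e2`.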

The importance of these results for theory and policy is great, as pointed out 
recently by Hotelling [10]. It is deplorable that they have not become more 
widely known among economists and economic statisticians interested in the 
cost of living index number problem (see however Ulmer, [27], pp. 81 ff.). 

Wald’s work on the empirical determination of indifference surfaces {39} is 
closely related to his writings on index numbers. He assumes again the existence 
in a certain region of a utility function or indicator which is a second order poly- 
nomial in the quantities consumed. The Engel curves are hence linear. From a 
knowledge of the Engel curves we are able to find an approximation to the 
(integrable) utility function or indifference system. Tests for the stability condi- 
tions are given. The integrability conditions are applied. The calculation of the 
demand functions of the individual goods is indicated. 

It is astonishing that until recently nobody has applied this theory to the 
determination of empirical indifference systems. Professor J. A. Nordin of Iowa 
State College has undertaken this task not long ago with good success. In view 
of the enormous amount of data available from budget studies it is to be hoped 
that his example will soon be followed by more econometricians. 

3. The minimax principle and its relationship to nonstatic economics. The 
novel statistical ideas introduced by Wald into the theory of statistical inference 
are possibly of great importance in economic statistics. Generalizing the Neyman-
Pearson theory of testing hypotheses, he bases himself upon a completely prag-
matic approach {94}. The values of the losses which are incurred by committing
errors of various kinds must be known. Then it is possible at least in principle
to choose the action which is in a sense most advantageous; this action mini-
mizes the risk among all possible actions. It should be mentioned that Wald
advocated the minimax principle in a tentative way and because of certain formal 
advantages. I am informed that he was still interested in finding a less conserv- 
ative and more satisfactory principle for statistical inference. 

To my mind, it is somewhat doubtful if principles of this kind are really 
applicable in the social sciences [26]. They are without any doubt applicable in 
industrial applications (quality control, etc.). Wald has made many important 
and interesting contributions to this field; for example, his rightly celebrated 
sequential analysis {76}. But it is somewhat doubtful if a similar pragmatic 
approach is applicable in the field of social policy. How can we evaluate and 
measure the comparative advantage and disadvantage of various actions in 
this field, for example, gains and losses produced by the adoption of competing 
social policies? Arrow’s recent work in this field [3] proves the nonexistence of a 
social welfare function, except if very special conditions hold. This result points 
definitely toward the impossibility of a pragmatic approach.

It should also be remarked that the minimax principle even if it is applicable
leads to an extremely conservative policy. Wald used the simile of a zero-sum
two-person game (von Neumann and Morgenstern, [30]). He considers a game
of two persons, Nature (in our case Society) against the statistician. It is postu-
lated in the theory of games that each player gains what the other loses. We
have to assume that Nature is trying to ruin the statistician, who has to take 
the action which will be the best possible even if Nature is trying to do its worst. 
But actually Nature is supremely indifferent to the statistician. Hence the ac- 
tions of the statistician if based upon Wald’s minimax principle will be very 
conservative (L. J. Savage, [21]; K. J. Arrow, [4]). 
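
The conservatism of the minimax rule can be seen in a toy example. The sketch below is only an illustration: the two actions, the two states of Nature, the losses and the prior probabilities are all hypothetical, and only pure (nonrandomized) actions are compared.

```python
def minimax_action(loss):
    """Action whose worst-case loss over the states of Nature is smallest."""
    worst = {action: max(row.values()) for action, row in loss.items()}
    return min(worst, key=worst.get), worst


def expected_loss_action(loss, prior):
    """Action with the smallest expected loss under an assumed prior."""
    expected = {
        action: sum(prior[state] * value for state, value in row.items())
        for action, row in loss.items()
    }
    return min(expected, key=expected.get), expected


# Hypothetical losses: rows are actions, columns are states of Nature.
loss = {
    "cautious": {"favorable": 4.0, "unfavorable": 5.0},
    "bold": {"favorable": 0.0, "unfavorable": 10.0},
}
# Hypothetical prior of an indifferent Nature that is usually favorable.
prior = {"favorable": 0.9, "unfavorable": 0.1}

minimax_choice, worst_case = minimax_action(loss)
average_choice, average_loss = expected_loss_action(loss, prior)
```

Here the minimax rule selects the cautious action because it guards against the unfavorable state, while the rule based on expected loss selects the bold one.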

Hurwicz [11] and Marschak [16] have pointed out that Wald’s principles 
may also be applicable to the theory of maximizing profits or utility in cases of 
incomplete information. This idea has never been fully elaborated. There is, 
however, not any doubt that indeed the principle is applicable, if we are willing 
to adopt the very conservative attitude which it implies. 

In connection with his work on decision functions Wald considered also a 
problem which is of paramount importance in economic statistics. This is the 
problem of multiple choice ({76}, pp. 138 ff.). Here we have to choose not between 
two hypotheses as in the Neyman-Pearson theory of testing hypotheses but be- 
tween any number of hypotheses. Even if we are unwilling to use the minimax 
principle in this field it is of the utmost importance that the statistician’s in- 
terest has been directed toward these crucial problems. Two examples which 
are of great importance in econometrics are as follows. 

Suppose we want to fit a polynomial trend to a series of given data. We do 
not know if a straight line, a parabola, a cubic, a quartic, etc. gives the most 
adequate fit. The methods given by Fisher give no adequate answer to the 
problem of the degree of the polynomial which ought to be chosen. This is essen- 
tially a multiple choice problem. 

Another example is as follows. Let us assume we have a number of observa- 
tions on a certain set of economic variables which are subject to errors. We want 
to know how many linear relationships exist between the systematic parts of 
our variables in the population from which our sample is taken. Again, the 
methods proposed by Geary [7], Anderson [2], and the author [25] are inadequate 
because we are faced with a multiple choice problem. 

Wald’s work in this field opens up the possibility that econometricians and 
statisticians may be able to deal with problems of multiple choice in a more 
rational manner than before. 

4. Elimination of the seasonal variation. This is perhaps the earliest work of 
Wald concerned with economic statistics {24}. It has been very well presented 
by Mendershausen [17]. Hence we can summarize it briefly. 

The following assumptions underlie the method. The given time series is the 
sum of three components: one component represents the nonseasonal move- 
ment, that is, the combined effect of trend and cycle; the second part is the 
season; the third a random element.

The following hypotheses are made concerning the individual parts of the
series: (1) the difference between the first component (combined trend and 
cycle) and its moving twelve-month average is negligible; (2) the mean value 
of the random component and its moving twelve-month average is approxi- 
mately zero; (3) the seasonal component can be represented as the product of 
a periodic function with a period of twelve months and another function which 
changes only slowly with time; (4) the twelve-month moving average of the 
seasonal component is approximately zero; (5) the slowly changing function is 
approximately constant in a short time interval; (6) the slowly changing com- 
ponent may be replaced by an approximation which assumes that it is constant 
over one year. The approximation is computed by the method of least squares. 

The seasonal component is then computed in the following way. First the 
moving twelve-month averages of all items of the given monthly series are 
computed. The raw seasonal factor is the average of the difference between the 
original series and the corresponding twelve-month moving average for the 
same item. This average is formed for each month. The raw values are corrected 
so that their sum is equal to zero. This permits then the computation of the 
series which is approximately free from seasonal fluctuations.
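
The steps of this computation can be sketched as follows. This is a simplified illustration rather than Wald's full procedure: the seasonal amplitude is taken as constant (the slowly changing factor of hypotheses (3) to (6) is omitted), and the twelve-month average is centered with half weights on the two end months, which is an assumption of the sketch. The function names and the test series are hypothetical, and at least 24 monthly observations are needed.

```python
def centered_ma12(series):
    """Centered twelve-month moving average; None where it is undefined."""
    n = len(series)
    averages = [None] * n
    for t in range(6, n - 6):
        window = series[t - 6 : t + 7]  # 13 months, end months get half weight
        averages[t] = (0.5 * window[0] + sum(window[1:12]) + 0.5 * window[12]) / 12.0
    return averages


def seasonal_factors(series):
    """Average deviation from the moving average per month, summing to zero."""
    averages = centered_ma12(series)
    totals = [0.0] * 12
    counts = [0] * 12
    for t, (value, average) in enumerate(zip(series, averages)):
        if average is not None:
            totals[t % 12] += value - average
            counts[t % 12] += 1
    raw = [total / count for total, count in zip(totals, counts)]
    mean_raw = sum(raw) / 12.0
    return [value - mean_raw for value in raw]  # corrected to sum to zero


def deseasonalize(series):
    """Subtract the seasonal factor of the corresponding month."""
    factors = seasonal_factors(series)
    return [value - factors[t % 12] for t, value in enumerate(series)]


# Hypothetical series: linear trend plus a fixed seasonal pattern, four years.
pattern = [3, -1, 2, -4, 0, 1, -2, 5, -3, 1, -1, -1]
trend = [100.0 + 0.5 * t for t in range(48)]
observed = [trend[t] + pattern[t % 12] for t in range(48)]
adjusted = deseasonalize(observed)
```

For this artificial series, which has no random component, the moving average reproduces the trend and the adjusted series coincides with it.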

Wald’s method of eliminating the seasonal fluctuations has many rivals. 
But it seems preferable in most cases because it is based upon assumptions 
which are very reasonable from the point of view of economics. It will certainly 
serve to eliminate approximately the season if it is changing not too rapidly. 
But if the season should change very rapidly some more complicated adjustments
should be used which are also given by Wald.

5. The problem of identification. The question of identification has come into
the foreground of econometric discussion since the important pioneer investi-
gations of Haavelmo [8]. The problem is as follows. Suppose we have a system
of equations which describes the behavior of the individuals in an economy and 
may also include definitory relations. The behavior equations include random 
terms which are the result of certain variables which have not been included in 
the system. The general form of the joint distribution of these random variables 
is known. Then we may ask: under what assumptions can we determine uniquely
the equations which express the behavior of the individuals in the economy? 

In the fundamental work edited by Koopmans we find two contributions of 
Wald to this subject: a note on identification of economic relations {92}, and
a contribution concerning the important problem of the estimation of unknown
parameters in an incomplete system of equations {93}.

This field of inquiry is so new that it is not yet very easy to see its permanent 
importance. There are only very few empirical investigations which use the 
methods and take care of the problem of identification. But the contributions 
of Wald are of such generality that they are bound to play an important part 
in the future development of these ideas. 

The importance of the question of identification has been shown very force- 
fully by Haavelmo [8]. It is of paramount significance for all investigations which
deal with the econometric verification of economic theories. But it also has very
important implications for economic policy. If measures of policy are based 
upon statistical information the question of identification cannot be neglected. 
Hence Wald’s contributions are also of great interest to anybody concerned 
with economic policy. 

6. Stochastic difference equations. Among the many important contributions 
of Wald in the field of economic statistics we mention only his treatment of 
linear stochastic difference equations, since this is of great value for econo- 
metrics {57}. He discussed several cases. 

Suppose first we have a single linear stochastic difference equation. This is a 
linear difference equation which includes a random term. We assume that the 
random terms are independently (but not necessarily normally) distributed.
The stochastic process is stationary. This excludes a trend. Then it can be shown
that the application of the classical method of least squares leads to consistent 
estimates of the constants in the difference equation. In the limit, that is for 
large samples, the estimates are jointly normally distributed. The covariance 
matrix of the estimates is ascertained for the large-sample case. This permits 
the testing of hypotheses. 
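
The consistency statement can be illustrated by simulation. The sketch below is not taken from Wald's paper: it assumes the simplest first-order equation $x_t = \beta x_{t-1} + u_t$ without constant term, uses uniformly distributed (hence non-normal) random terms, and starts the series at zero without a burn-in period. The function names, the seed and the value $\beta = 0.6$ are hypothetical.

```python
import random


def simulate_first_order(beta, n, seed=0):
    """Simulate x_t = beta * x_{t-1} + u_t with independent uniform u_t."""
    rng = random.Random(seed)
    x = [0.0]
    for _ in range(n):
        u = rng.uniform(-1.0, 1.0)  # zero mean, independent, not normal
        x.append(beta * x[-1] + u)
    return x


def least_squares_coefficient(x):
    """Classical least squares estimate of beta from regressing x_t on x_{t-1}."""
    numerator = sum(current * previous for previous, current in zip(x, x[1:]))
    denominator = sum(previous * previous for previous in x[:-1])
    return numerator / denominator


series = simulate_first_order(beta=0.6, n=20000)
estimate = least_squares_coefficient(series)
```

With a sample of this size the estimate lies close to the true coefficient, in line with the large-sample result; nothing is implied about small samples.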

A more complicated case is a system of simultaneous linear stochastic differ-
ence equations. The random terms are not autocorrelated and the total system
is stationary. If the matrix of the coefficients which involve no time lag is the 
unit matrix, then the maximum likelihood estimates are shown to follow from 
an application of the method of least squares to each single equation in the sys- 
tem. The large-sample limiting distribution of the estimates is derived. They 
are again jointly normally distributed and their covariance matrix is estimated. 

It is also shown that in the general case there are no unique estimates. We 
need a priori restrictions on the coefficients of the system or on the covariance 
matrix of the random elements. 

These contributions are of great importance in practical econometric work. 
It is to be hoped that they can be amplified and extended and that they will 
stimulate work leading to the derivation of small-sample distributions in this 
field. These would be more useful than the large-sample distributions derived 
by Wald but are much more difficult to obtain.



ON THE POWER FUNCTION OF TESTS OF RANDOMNESS BASED
ON RUNS UP AND DOWN


By HOWARD LEVENE
Columbia University 


1. Summary. It is shown that various statistics based on the number of runs 
up and down have an asymptotic multivariate normal distribution under a 
number of different alternatives to randomness. The concept of likelihood ratio
statistics is extended to give a method for deciding what function of these runs 
should be used, and it is shown that the asymptotic power of these tests depends 
only on the covariance matrix, calculated under the hypothesis of randomness, 
and the expected values, calculated under the alternative hypothesis. A general 
method is given for calculating these expected values when the observations are 
independent, and these calculations are carried through for a constant shift in 
location from one observation to the next and for normal and rectangular popu- 
lations. 

2. Introduction. Let the vector random variable $X^{(n)} = (X_1, \ldots, X_n)$ have
the joint cumulative distribution function $F^{(n)} = F^{(n)}(x_1, \ldots, x_n)$. Throughout
this paper we will suppose that $F^{(n)}$ is continuous. Let $\Omega_n$ be the class of all
continuous $F^{(n)}$, and let $\omega_n$ be the class of all $F^{(n)}$ of the form
$F^{(n)} = \prod_{i=1}^{n} F^{(1)}(x_i)$, where
$F^{(1)}$ is some continuous univariate distribution function. By the hypothesis of
randomness, $H_0$, we mean the hypothesis that $F^{(n)}$, known to belong to $\Omega_n$,
actually belongs to $\omega_n$. The statistical problem is to test $H_0$ on the basis of one
observation $x^{(n)}$ on $X^{(n)}$.

Many methods of testing this hypothesis have been proposed. The most 
usual procedure has been for the statistician to devise some statistic whose 
distribution under the null hypothesis could be obtained without too much 
trouble. Then if extreme values of this statistic were observed, the hypothesis 
of randomness was rejected. Occasionally the appropriateness of the statistic 
would be considered. A common type of reasoning is that such and such a test 
classifies as random a set of numbers that are “obviously” nonrandom, or vice
versa. Now suppose we replace the original observations by their ranks. Then 
under the hypothesis of randomness all sequences of ranks are equally likely and 
each is as “random” as the next. On the other hand, if we look long enough, we 
will find something very peculiar and nonrandom about any given sequence, 
and can prove that the probability of this peculiarity arising by chance is very 
small. The difficulty is that randomness is not a property of a sequence of numbers,
but of the process that produced them, that is, of $F^{(n)}$. Hence what we
really want is a test with a high probability of rejecting $H_0$ whenever $F^{(n)} \notin \omega_n$.
Unfortunately no such test exists. In fact, given any critical region of size $\alpha$,
there exists $F^{(n)} \notin \omega_n$ for which the probability of the critical region is zero. Two
ways may be found out of this dilemma. The more satisfying from a theoretical
point of view is to restrict $F^{(n)}$ to a class of alternatives especially feared, and to
choose a statistic with reasonably good power against these alternatives. The 
second method is to restrict ourselves arbitrarily to a definite class of statistics 
which has desirable properties such as convenience, and then to choose an opti- 
mum statistic from this class. Both approaches will be used in this paper. Which- 
ever approach is chosen, it would be desirable to have a method of constructing 
a good test. We will exhibit one such method based on the second approach. 
However, in most cases no method of constructing a good test is known. It then 
becomes necessary to investigate the behavior of the power function of a number 
of previously devised tests, and to choose the one having the most desirable 
power function for the purpose in hand. In the present paper a start will be 
made in this direction for statistics based on runs up and down. These statistics 
have been independently discovered by a number of different authors and have 
been widely advocated for testing randomness. 

The continuity of $F^{(n)}$ insures that, under $H_0$, Prob $\{X_i = X_j\} = 0$ for all
$i \neq j$. This will also be true for $F^{(n)} \in \Omega_n$, for the type of distributions ordinarily
considered. We will therefore assume that the observations $(x_1, \ldots, x_n)$ are
distinct. Let B* be the sequence of signs (+ or −) of the differences $(x_{i+1} - x_i)$
for $i = 1, \ldots, n - 1$. A sequence of $p$ successive + (−) signs not immediately
preceded or followed by a + (−) sign is called a run up (down) of length $p$. The
term “runs up and down” (or u-runs) applies to both runs up and runs down.
As an example, if the observations are (5 7 3 4 8 1), then B* = (+ − + + −), and
there are four u-runs: one run up of length one, one up of length two, and two
down of length one.
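
The definitions can be made concrete with a short sketch that reproduces the example. The function names are hypothetical, and distinct observations are assumed, as in the text.

```python
from itertools import groupby


def sign_sequence(observations):
    """B*: '+' where the next observation is larger, '-' where it is smaller."""
    return [
        "+" if later > earlier else "-"
        for earlier, later in zip(observations, observations[1:])
    ]


def u_runs(observations):
    """Runs up and down in order of appearance, as (sign, length) pairs."""
    return [
        (sign, len(list(group)))
        for sign, group in groupby(sign_sequence(observations))
    ]


example = [5, 7, 3, 4, 8, 1]
signs = sign_sequence(example)   # ['+', '-', '+', '+', '-']
runs = u_runs(example)           # [('+', 1), ('-', 1), ('+', 2), ('-', 1)]
```

The four pairs in `runs` are the four u-runs of the example: runs up of lengths one and two, and two runs down of length one.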

Let $s$ be the number of runs up, $s_p$ the number of runs up of length $p$, and $s'_p$
the number of runs up of length $p$ or more in B*. Let $t$, $t_p$, and $t'_p$ be similarly
defined for runs down, and let $r = s + t$, $r_p = s_p + t_p$, $r'_p = s'_p + t'_p$. Let $k$ equal
the total number of + signs in B*. The $r$’s, $s$’s, $t$’s and $k$ will be called u-run
statistics. Levene and Wolfowitz [1] have given the exact covariance matrix and
expected values of the $r$’s, and Moore and Wallis [2] have given $E(k) = \frac{1}{2}(n - 1)$
and $\sigma^2(k) = (n + 1)/12$.
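
For small $n$ the moments of $k$ can be verified by exhaustive enumeration, since under the hypothesis of randomness all $n!$ orderings of the ranks are equally likely. The sketch below is only such a check, with hypothetical function names; it is not the derivation of Moore and Wallis.

```python
from fractions import Fraction
from itertools import permutations


def plus_signs(observations):
    """k: number of + signs among the first differences."""
    return sum(
        1 for earlier, later in zip(observations, observations[1:]) if later > earlier
    )


def exact_moments_of_k(n):
    """Exact mean and variance of k over all n! equally likely rank orders."""
    values = [plus_signs(order) for order in permutations(range(n))]
    count = len(values)
    mean = Fraction(sum(values), count)
    variance = Fraction(sum(v * v for v in values), count) - mean ** 2
    return mean, variance


mean_6, variance_6 = exact_moments_of_k(6)   # 720 orderings
```

For $n = 6$ the enumeration gives mean 5/2 and variance 7/12, in agreement with $\frac{1}{2}(n - 1)$ and $(n + 1)/12$.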

3. Asymptotic distributions. When $H_0$ is true, certain recurrence relations for
the exact distribution of a single u-run statistic are known, and Gleissberg [3] 
has tabulated the exact value of Prob (r — 1 > x) for n < 25 (Wallis and Moore 
[4] having given this for n < 12), but no usable exact distribution function is 
known or is likely to be found. Hence it is important to have asymptotic for-
mulas. Wolfowitz [5] proved that under $H_0$ any fixed set of u-run statistics have
a joint multivariate normal distribution in the limit. (If the set chosen are 
linearly dependent in the limit, their joint limit distribution will be degenerate.) 
We will indicate Wolfowitz’s proof for the total runs r. There is no essential 
difficulty in generalizing to a set of u-run statistics. 

Let the sequence of observations be broken up into subsequences

(3.1) $x_{(j-1)\alpha+1}, x_{(j-1)\alpha+2}, \ldots, x_{j\alpha}$

where $\alpha = n^{3/4}$, $\beta = n^{1/4}$ approximately. Let $r^{(j)}$ be the number of runs in the $j$th
subsequence. The partitioning of the original sequence breaks up some runs
and forms some new ones, but at most two runs in each subsection are affected,
so that

(3.2) $\left| r - \sum_{j=1}^{\beta} r^{(j)} \right| \le 2\beta.$

But under $H_0$ we have (1) $r^{(i)}$ and $r^{(j)}$ are independent for $i \neq j$; (2) $r^{(j)}$ has the
same distribution for all $j$; and (3) $\sigma^2[r^{(j)}]/\alpha$ and $\mu_4[r^{(j)}]/\alpha^2$, where $\mu_4[r^{(j)}]$ is the
fourth moment of $r^{(j)}$ about its mean, can be shown to approach fixed limits
$\neq 0$ as $n \to \infty$. Hence the Lyapunov theorem applies and

$\dfrac{\sum_{j=1}^{\beta} \{ r^{(j)} - E[r^{(j)}] \}}{\sqrt{n}}$

is asymptotically normally distributed with zero mean and finite variance. But

(3.3) $\dfrac{\left| r - \sum_{j=1}^{\beta} r^{(j)} \right|}{\sqrt{n}} \le \dfrac{2\beta}{\sqrt{n}} \to 0,$

so $r$ is likewise asymptotically normal.


Apparently it has not previously been noted that randomness of the sequence
$\{X_1, \ldots, X_n\}$ is not necessary for the validity of this proof. We will consider
a number of alternatives under which the limit distribution of u-run statistics 
is normal. 

(a) We will say there is a linear trend if $F^{(n)} = \prod_{i=1}^{n} F^{(1)}(x_i - \theta_i)$, with $\theta_i = i\theta$.
Then (1), (2) and (3) above will hold. Even if $\theta_i$ is only approximately equal to
$i\theta$, we will still have asymptotic normality, although condition (2) will not hold.

(b) If the scale of the distribution changes by a constant factor from one
observation to the next, that is,

(3.4) $F^{(n)} = \prod_{i=1}^{n} F_i^{(1)}(x_i)$

with

(3.5) $F_i^{(1)}(x) = F_{i-1}^{(1)}(cx)$ $(c > 0)$,

we have normality in the limit. If the scale increases or decreases monotonically
at less than this exponential rate, the limit distribution is the same as under $H_0$.

(c) We will say there is a cycle of period $p$ if

(3.6) $F^{(n)} = \prod_{j=1}^{n/p} F^{(p)}(x_{(j-1)p+1}, x_{(j-1)p+2}, \ldots, x_{jp}).$

A special case of this is

(3.7) $F^{(n)} = \prod_{i=1}^{n} F_i^{(1)}(x_i),$

with

(3.8) $F_i^{(1)} = F_j^{(1)}$ $(i \equiv j \bmod p)$.
Here again conditions (1), (2) and (3) hold approximately for large n and exactly 
if n/p is an integer. 

(d) If the $X_i$ satisfy a stable linear stochastic difference equation, for example,

(3.9) $X_{i+1} = \beta X_i + U_{i+1}$ $(|\beta| < 1)$,

where the random variables $U_i$ are independent and equidistributed, the methods
used by S. Bernstein [6] to prove the Central Limit Theorem for Markov chains
can be used to prove asymptotic normality.

(e) The unstable stochastic difference equation $X_{i+1} = X_i + U_{i+1}$ is of special
interest since the exact distributions are known. They are the same as the dis-
tributions of runs of two kinds of elements drawn from a binomial population
which were given by Mood [7].

(f) If the marginal distributions of the $X_i$ are such that we would have asymp-
totic normality if the $X_i$ were independent, the asymptotic normality will still
hold under the weaker condition that $X_1, \ldots, X_i$ are independent of $X_j$,
$X_{j+1}, \ldots, X_n$ for all $i$ and $j$ with $j - i$ greater than some positive constant.

It is clear that these special cases are not exhaustive, but they seem to cover 
the most interesting possibilities. If some other $F^{(n)}$ should prove of interest, it
should be fairly easy to see whether the conditions for normality are fulfilled. 

4. The likelihood ratio statistic. Let $p(\xi^{(n)} \mid F^{(n)})$ be the elementary proba-
bility of the sample point $\xi^{(n)}$ when $F^{(n)}$ is the true distribution. Let

$p_\omega(\xi^{(n)}) = \sup_{F^{(n)} \in \omega_n} p(\xi^{(n)} \mid F^{(n)})$

and

$p_\Omega(\xi^{(n)}) = \sup_{F^{(n)} \in \Omega_n} p(\xi^{(n)} \mid F^{(n)}).$

Then the likelihood ratio statistic of Neyman and Pearson [8] is

(4.1) $\lambda = \dfrac{p_\omega(\xi^{(n)})}{p_\Omega(\xi^{(n)})}.$

In general this expression has no meaning in the nonparametric case. Wolfowitz
[9] adapted it to the two-sample problem by considering only the sequence, B,
of the ranks of the observations. For $\xi^{(n)}$ a point in the space of permutations of
B, $p_\omega(\xi^{(n)}) = 1/n!$, and $\lambda$ is equivalent to $p_\Omega(\xi^{(n)})$. Wolfowitz was able to obtain
an approximation to $p_\Omega(\xi^{(n)})$ and suggested its use as the test statistic. Unfortu-
nately, under these conditions the randomness hypothesis $H_0$ leads to $p_\omega(\xi^{(n)}) =
1/n!$ and $p_\Omega(\xi^{(n)}) = 1$ for every $\xi^{(n)}$, so that $\lambda$ is a constant and cannot be used.
Now suppose $\xi^{(n)}$ is further restricted to the space of all possible sequences of
signs of first difference, B*. For any rank statistic, $p_\Omega(\xi^{(n)}) = 1$; thus we always
have $\lambda = p_\omega(\xi^{(n)})$. But now $p_\omega(\xi^{(n)})$ is no longer a constant, and we may take
the critical region

(4.2) $W_n(B^*)$: $p_\omega(\xi^{(n)}) \le c$.


If we give the sequence of runs up and down in order, with their lengths, we
specify the sequence B*. The next step is to give only the frequency distribution
of runs up and down. The following step is to group together all long runs and
restrict ourselves to the space of the statistics B**: $s_1, s_2, \ldots, s_p, s'_{p+1}, t_1, t_2,
\ldots, t_q$. But in the limit these have a joint multivariate normal distribution,
so in the limit $\lambda$ is equivalent to

$Q(s_1, s_2, \ldots, s_p, s'_{p+1}, t_1, t_2, \ldots, t_q),$

where for any set of random variables $x_1, \ldots, x_\nu$

(4.3) $Q(x_1, \ldots, x_\nu) = \sum_{i,j=1}^{\nu} \sigma^{ij} [x_i - E(x_i)][x_j - E(x_j)]$

with

$\| \sigma^{ij} \| = \| \sigma_{ij} \|^{-1}.$

For our case we use the covariance matrix under $H_0$, and the critical region is

(4.4) $W_n(B^{**})$: $Q \ge C$.

Since $|t - s| \le 1$, $t'_{q+1}$ need not be included in B**; if it were, $\| \sigma_{ij} \|$ would be
singular in the limit.

Intuitive considerations similar to those that originally led Neyman and
Pearson to the likelihood ratio statistic suggest that $W_n(B^*)$ is the “best” sta-
tistic depending only on the sequence B*. It would then follow that $W_n(B^{**})$ is
less efficient; in other words, information has been lost in ignoring the long runs
and the order of appearance of the runs. Still further information will be lost if
runs up and runs down of the same length are combined and the statistics B***:
$r_1, \ldots, r_p, r'_{p+1}$ are used. While it is not practicable to use the region $W_n(B^*)$,
the region $W_n(B^{**})$ can be used. In a previous paper (Levene and Wolfowitz,
[1]) the covariance matrix of the $r$’s was given. Because of the desirability of using
the region $W_n(B^{**})$ the covariance matrix of the $s$’s and $t$’s has now been com-
puted and is given in the Appendix to this paper. Because of the weight of the
formulas and the possibilities for error in substituting numerical values of $p$ and
$q$, the numerical values needed for tests based on $s$; on $s_1, s'_2, t_1$; and on $s_1, s_2,
s'_3, t_1, t_2$ are given, as are a few additional values. These values have been checked
by addition, using formulas of the type

(4.5) $\sigma^2(r) = \sigma^2(s_1 + s'_2 + t_1 + t'_2),$


where the right-hand member is to be expanded as a sum of sixteen terms. The
methods used in obtaining the covariances are similar to those used in Levene
and Wolfowitz [1]. The covariances of $k$, the total number of plus signs in B*,
with $s_p$ and $s'_p$ are also given. It can be shown as follows that under $H_0$, $\sigma(s_p, k) =
-\sigma(t_p, k)$, $\sigma(s'_p, k) = -\sigma(t'_p, k)$, and $\sigma(r_p, k) = \sigma(r'_p, k) = 0$. We have $\sigma(r_p, k) =
\sigma(s_p, k) + \sigma(t_p, k)$. But by symmetry, under $H_0$, $\sigma(t_p, k) = \sigma(s_p, k')$, where
$k'$ = total number of minus signs in B*. Hence $\sigma(r_p, k) = \sigma(s_p, k) + \sigma(s_p, k') =
\sigma(s_p, k + k') = \sigma(s_p, n - 1) = 0$, since $n - 1$ is a constant.

Although k is not independent of $r_1, \cdots, r_p, r'_{p+1}$, under $H_0$ it is uncorrelated
with them, and since k and the r's have a joint normal distribution in the limit,
it follows that $Q(k)$ and $Q(r_1, \cdots, r_p, r'_{p+1})$ are independently distributed in
the limit as $\chi^2_1$ and $\chi^2_{p+1}$, respectively. Thus, for example, the $\lambda$ statistic
depending only on k and r is

(4.6) $\dfrac{[k - E(k)]^2}{\sigma^2(k)} + \dfrac{[r - E(r)]^2}{\sigma^2(r)} \geq C$.

This statistic is very easy to compute and use. A rough idea of the type of de- 
parture from randomness may be obtained from the relative size of the two 
components, since it can be shown, for example, that the test based on k is more 
powerful for linear trends and less powerful for certain cyclical trends than is 
the test based on r. 
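
As a rough illustration of how the statistic in (4.6) might be computed from data, the following Python sketch counts the plus signs k and the total number of runs r, and combines them. The function names are hypothetical. The null moments used are assumptions quoted from the standard results for signs of differences and runs up and down: E(k) = (n - 1)/2 as in the Appendix, together with the usual exact values $\sigma^2(k) = (n + 1)/12$, $E(r) = (2n - 1)/3$ and $\sigma^2(r) = (16n - 29)/90$. Ties are assumed absent, as they are with probability one when F is continuous.

```python
def signs_of_differences(xs):
    """Signs (+1 or -1) of successive differences; ties are assumed absent."""
    return [1 if b > a else -1 for a, b in zip(xs, xs[1:])]


def k_and_r(xs):
    """Return (k, r): number of plus signs and total number of runs up and down."""
    signs = signs_of_differences(xs)
    k = sum(1 for s in signs if s > 0)
    # A new run starts at the first sign and at every change of sign.
    r = 1 + sum(1 for a, b in zip(signs, signs[1:]) if a != b)
    return k, r


def combined_statistic(xs):
    """Left member of (4.6), using the assumed null moments stated above."""
    n = len(xs)
    k, r = k_and_r(xs)
    e_k, var_k = (n - 1) / 2, (n + 1) / 12
    e_r, var_r = (2 * n - 1) / 3, (16 * n - 29) / 90
    return (k - e_k) ** 2 / var_k + (r - e_r) ** 2 / var_r
```

For large n the result would be referred to the $\chi^2$ distribution with two degrees of freedom, and the two summands can be inspected separately to see which component dominates.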

5. The asymptotic power function. Under $H_0$ the exact distribution of u-run
statistics is extremely cumbersome and impractical. For any alternative the
exact distribution would be still more complicated, if, indeed, it could be
obtained at all. Since we are thus constrained to use the asymptotic theory in any
case, we may as well take advantage of this to introduce certain simplifications.
Let $F$ represent an infinite sequence $\{F^{(n)}\}$ such that, for $k < m$, $F^{(k)}(x_1, \cdots,
x_k) = F^{(m)}(x_1, \cdots, x_k, \infty, \cdots, \infty)$. If $u$ and $v$ are any two u-run statistics,
then for $H_0$ or for a number of important alternatives, for example, a linear
trend, a cyclic alternative, or a stable stochastic difference equation, we have

(5.1) $E(u) = n a_1 + a_2$,
      $\sigma^2(u) = n a_3 + a_4$,
      $\sigma(u, v) = n a_5 + a_6$,

where the $a_i$ are constants depending on $F$. Let

(5.2) $E'(u) = \lim_{n \to \infty} \frac{1}{n} E(u) = a_1$,
      $\sigma'^2(u) = \lim_{n \to \infty} \frac{1}{n} \sigma^2(u) = a_3$,
      $\sigma'(u, v) = \lim_{n \to \infty} \frac{1}{n} \sigma(u, v) = a_5$.

Then for large n, $E(u) \sim nE'(u)$, $\sigma(u) \sim \sqrt{n}\,\sigma'(u)$, and $\sigma(u, v) \sim n\sigma'(u, v)$,
where the symbol $\sim$ means "is asymptotically equal to."
Furthermore, if $F$ is such that the u-run statistics are normally distributed in
the limit (see Section 3) the limits $E'(u)$ and $\sigma'(u, v)$ will usually exist; and the
asymptotic distribution of the u-run statistics is completely determined as soon
as we have the $E'(u)$'s and the $\sigma'(u, v)$ matrix. For the remainder of this paper we
consider only $F$ of this type.

Now suppose we consider the hypothesis $H_0$ and a definite alternative
hypothesis $H_1$: $F = F_1$. Then let $E_0(u)$ and $\sigma_0(u, v)$ be the expected values and covariances
under $H_0$, and $E_1(u)$ and $\sigma_1(u, v)$ be the corresponding values under $H_1$. We can
then compute the power of the test.

For concreteness suppose we have a linear trend:

(5.3) $F^{(n)} = \prod_{i=1}^{n} F(x_i - i\theta)$.

We first consider the test based on total runs r. Then $E_0(r) > E_1(r)$. Suppose we
use the lower tail of r as critical region. For size $\alpha$ the test will have power at
least $1 - \beta$ if

(5.4) $E_0(r) - \lambda_\alpha \sigma_0(r) > E_1(r) + \lambda_\beta \sigma_1(r)$,

where

(5.5) $\dfrac{1}{\sqrt{2\pi}} \int_{\lambda_\alpha}^{\infty} e^{-t^2/2}\, dt = \alpha$

and

(5.6) $\dfrac{1}{\sqrt{2\pi}} \int_{\lambda_\beta}^{\infty} e^{-t^2/2}\, dt = \beta$.

(5.4) may be written
$E_0(r) - E_1(r) > \lambda_\alpha \sigma_0(r) + \lambda_\beta \sigma_1(r)$,
or, using the approximate values,

(5.7) $\sqrt{n}\,[E'_0(r) - E'_1(r)] > [\lambda_\alpha \sigma'_0(r) + \lambda_\beta \sigma'_1(r)]$.

Since the terms in brackets depend on $\theta$ but not on n, the inequality will hold
for large enough n for any fixed $\theta > 0$, and any $\lambda_\alpha$ and $\lambda_\beta$. In order to have a
situation of statistical interest, it is necessary to let $\theta \to 0$ as $n \to \infty$ in such a
way that $\sqrt{n}\,[E'_0(r) - E'_1(r)]$ remains constant. Under these conditions $\sigma'_1(r) \to
\sigma'_0(r)$, and consequently we may write (5.7) as

(5.8) $\sqrt{n}\,[E'_0(r) - E'_1(r)] > (\lambda_\alpha + \lambda_\beta)\,\sigma'_0(r)$.

Thus for large n the power depends only on

$\Delta(r) = \dfrac{E'_0(r) - E'_1(r)}{\sigma'_0(r)}$.

Similarly, if a two-tail test were used, we should find that the power of the test
depended only on

(5.9) $\Delta^2(r) = \dfrac{[E'_0(r) - E'_1(r)]^2}{[\sigma'_0(r)]^2}$.

We shall call $\Delta^2(r)$, which is a monotonic function of the asymptotic power of
the test, the asymptotic power index.
Now let $u_1, \cdots, u_p$ be a set of linearly independent u-run statistics with
covariance matrix $\|\sigma_{ij}\|$. Let $\|\sigma^{ij}\| = \|\sigma_{ij}\|^{-1}$. Then the critical region is

(5.10) $Q = \sum_{i,j} \sigma_0^{ij}[u_i - E_0(u_i)][u_j - E_0(u_j)] > C$.

Again

(5.11) $Q \sim n \sum_{i,j} (\sigma'^{ij})_0 \left[\dfrac{u_i}{n} - E'_0(u_i)\right]\left[\dfrac{u_j}{n} - E'_0(u_j)\right]$.

In determining the distribution of Q under $H_1$ when n is large and for the cases
of interest the matrix $\|(\sigma_{ij})_1\|$ can be replaced by the null covariance matrix
$\|(\sigma_{ij})_0\|$. We then have Q distributed under $H_0$ as $\chi^2_p$ and under $H_1$ as a sum
of noncentral squares $\chi'^2_p$ with parameter

(5.12) $n\Delta^2 = n \sum_{i,j} (\sigma'^{ij})_0\,[E'_0(u_i) - E'_1(u_i)][E'_0(u_j) - E'_1(u_j)]$.

(See Tang [11] for the $\chi'^2$ distribution. Tang uses the parameters $\lambda = n\Delta^2/2$
and $\phi = \Delta\sqrt{n/(p + 1)}$.) We will call $\Delta^2 = \Delta^2(u_1, \cdots, u_p)$ the multivariate
asymptotic power index. It is easy to see that $\Delta^2(r)$ defined above is a special case
of this.
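
The quadratic form in (5.12) is easy to evaluate numerically once the limiting null covariance matrix and the two sets of limiting means are known. The sketch below is only illustrative: the helper names are hypothetical, the linear system is solved by plain Gaussian elimination so that no external library is needed, and the alternative means in the usage note are made-up numbers, not values from this paper.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[piv] = m[piv], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for j in range(c, n + 1):
                m[r][j] -= f * m[c][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def power_index(cov0, e0, e1):
    """Delta^2 = d' * inverse(cov0) * d, where d = e0 - e1 (limiting values)."""
    d = [a - b for a, b in zip(e0, e1)]
    w = solve(cov0, d)  # w = inverse(cov0) * d without forming the inverse
    return sum(di * wi for di, wi in zip(d, w))
```

For instance, for the pair (k, r), which is uncorrelated under $H_0$, one might take the assumed limiting null values $\sigma'^2(k) = 1/12$, $\sigma'^2(r) = 16/90$, $E'_0(k) = 1/2$, $E'_0(r) = 2/3$; with hypothetical alternative means 0.6 and 0.6 the call `power_index([[1/12, 0], [0, 16/90]], [0.5, 2/3], [0.6, 0.6])` gives 0.145, the sum of the two univariate indexes.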

The above reasoning will hold for a number of other types of alternative as
well as it does for the linear trend. We thus see that an investigation of the
asymptotic power of u-run statistics in these cases requires only the finding of
the $E'_1(u)$, and we will consider ways of doing this in the next three sections.
This situation is very fortunate for three reasons. First, it is much more labor to
find the covariances than the expected values. Second, when the $E'_1(u_i)$ differ
from the $E'_0(u_i)$ and the $(\sigma_{ij})_1$ differ from the $(\sigma_{ij})_0$, the asymptotic distribution
of Q becomes the distribution of an arbitrary quadratic form in normal variates,
and is extremely difficult to handle. Third, we can now show that Q, recommended
on intuitive grounds in the last section as the likelihood ratio statistic, has
optimum properties. In the space of the u-run statistics $(u_1, \cdots, u_p)$, say, we are
essentially testing the simple hypothesis that the variables $(u_1, \cdots, u_p)$,
normally distributed with the covariance matrix $\|\sigma_{ij}\|$, have means
$(\mu^0_1, \cdots, \mu^0_p)$, against the alternative that the means are $(\mu_1, \cdots, \mu_p)$,
with $\max |\mu^0_i - \mu_i| = O(n^{1/2})$. For this hypothesis Wald [11] has shown that
$W_n(\alpha)$: $Q > C$ has optimum properties.

6. Expected values in general. So far we have only assumed that F was
continuous. To obtain the expected values, we assume that the probability
density function, $f(x_1, \cdots, x_n)$, exists. Then

(6.1) $E(s) \sim \sum_{i=1}^{n-2} \int \cdots \int_{-\infty}^{x_{i+2}} \int_{x_{i+1}}^{\infty} \cdots \int f(x_1, \cdots, x_n)\, dx_1 \cdots dx_i\, dx_{i+1} \cdots dx_n$,

where in each term of the sum the integration is to be from $-\infty$ to $+\infty$ for
every $x_j$ $(j = 1, \cdots, n)$ except $x_i$ and $x_{i+1}$, with $x_i$ running from $x_{i+1}$ to $+\infty$
and $x_{i+1}$ from $-\infty$ to $x_{i+2}$. For the remainder of this paper we
will further assume that $f^{(n)} = \prod f_i^{(n)}(x_i)$, and will omit the superscript indicating
dimensionality, writing the joint density function as

(6.2) $f(x_1, \cdots, x_n) = \prod_{i=1}^{n} f_i(x_i)$.


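Equation (6.1) then becomes

(6.3) $E(s) \sim \sum_{i=1}^{n-2} \int_{-\infty}^{\infty} f_{i+2}(x_{i+2}) \int_{-\infty}^{x_{i+2}} f_{i+1}(x_{i+1}) \int_{x_{i+1}}^{\infty} f_i(x_i)\, dx_i\, dx_{i+1}\, dx_{i+2}$.

Similarly we have

(6.4) $E(k) \sim \sum_{i=1}^{n-1} \int_{-\infty}^{\infty} f_{i+1}(x_{i+1}) \int_{-\infty}^{x_{i+1}} f_i(x_i)\, dx_i\, dx_{i+1}$

and

(6.5) $E(s'_p) \sim \sum_{i=1}^{n-p-1} \int_{-\infty}^{\infty} f_{i+p+1}(x_{i+p+1}) \int_{-\infty}^{x_{i+p+1}} f_{i+p}(x_{i+p}) \cdots \int_{-\infty}^{x_{i+2}} f_{i+1}(x_{i+1}) \int_{x_{i+1}}^{\infty} f_i(x_i)\, dx_i \cdots dx_{i+p+1}$.

For the linear trend, $f_i(x_i) = f(x_i - i\theta)$, all terms in the sum (6.3) are equal
and we have

(6.6) $E'(s) = \int_{-\infty}^{\infty} f(x_3 - 3\theta) \int_{-\infty}^{x_3} f(x_2 - 2\theta) \int_{x_2}^{\infty} f(x_1 - \theta)\, dx_1\, dx_2\, dx_3$,

while for a cyclic alternative of period T we have

(6.7) $E'(s) = \dfrac{1}{T} \sum_{i=1}^{T} \int_{-\infty}^{\infty} f_{i+2}(x_{i+2}) \int_{-\infty}^{x_{i+2}} f_{i+1}(x_{i+1}) \int_{x_{i+1}}^{\infty} f_i(x_i)\, dx_i\, dx_{i+1}\, dx_{i+2}$,

where $E'(s)$ is defined by (5.2) as $\lim \frac{1}{n} E(s)$. These simplifications hold for every
u-run statistic.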


We will deal only with $E'(s'_p)$ and $E'(t'_p)$, since $s_p = s'_p - s'_{p+1}$, etc. We also
note that $E'(s) = E'(t) = \frac{1}{2} E'(r)$, since $|s - t| \leq 1$, and that the distribution
of $t'_p$ for a sequence $\{X_i\}$ is the distribution of $s'_p$ for $\{-X_i\}$.

Even in the simplest possible case, a linear trend with $\theta$ given, the value of
$E'(s)$, etc., depends on the underlying distribution $f(x)$, which must be specified
before we can integrate. We will obtain expected values for $f(x)$ rectangular,
and for the most important case, $f(x)$ normal.

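7. Expected values for rectangular populations. Let

$f_i(x_i) = \begin{cases} 1, & i\theta \leq x_i \leq i\theta + 1, \\ 0, & \text{elsewhere.} \end{cases}$

Then the integrations indicated in Section 6 can be performed explicitly. The
only complication is the usual one associated with rectangular distributions: the
integral must be broken up into a number of parts, each with a different
integrand; and the enumeration of the possibilities rapidly becomes tedious as the
number of integrations increases. We list a number of the simpler results.

(7.1) $E'(k) = \begin{cases} \frac{1}{2}[1 + \theta(2 - |\theta|)], & -1 \leq \theta \leq 1, \\ 0, & \theta < -1, \\ 1, & \theta > 1, \end{cases}$

(7.2) $E'(s) = \begin{cases} \frac{1}{6}(2 - 9\theta^2 + 8|\theta|^3), & |\theta| \leq \frac{1}{2}, \\ \frac{1}{2}(1 - |\theta|)^2, & \frac{1}{2} \leq |\theta| \leq 1, \\ 0, & |\theta| \geq 1, \end{cases}$

(7.3) $E'(s'_2) = \frac{1}{24}(3 + 12\theta - 18\theta^2 - 52\theta^3 + 75\theta^4)$, $0 \leq \theta \leq \frac{1}{3}$,

(7.4) $E'(t'_2) = \frac{1}{24}(3 - 12\theta - 6\theta^2 + 76\theta^3 - 81\theta^4)$, $0 \leq \theta \leq \frac{1}{3}$,

(7.5) $E'(r'_2) = \frac{1}{4}(1 - 4\theta^2 + 4\theta^3 - \theta^4)$, $0 \leq \theta \leq \frac{1}{3}$.

It will be noted that for $\theta$ close to zero, which is the most interesting case, the
test based on $s'_2$ (or $t'_2$) is much more powerful than the test based on $r'_2$, since
the asymptotic power indexes (see (5.9)) are of order $\theta^2$ and $\theta^4$ respectively.
For one special case a simple general formula is possible, namely

(7.6) $E'(t'_p) = \begin{cases} \dfrac{1}{(p+1)!}\left\{(1 - p\theta)^{p+1} - \dfrac{1}{p+2}[1 - (p+1)\theta]^{p+2}\right\}, & 0 \leq \theta \leq \dfrac{1}{p+1}, \\ \dfrac{1}{(p+1)!}(1 - p\theta)^{p+1}, & \dfrac{1}{p+1} \leq \theta \leq \dfrac{1}{p}, \\ 0, & \theta \geq \dfrac{1}{p}. \end{cases}$
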
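
The closed forms for a rectangular population can be checked by simulation. The sketch below is not part of the original computations: it codes the piecewise expressions for $E'(k)$ and $E'(s)$ (unit rectangular distribution with linear trend $\theta$ per observation) and compares them with a simple seeded Monte Carlo estimate. The function names, the number of trials and the seed are all arbitrary choices.

```python
import random


def e_k_rect(theta):
    """E'(k): limiting proportion of plus signs, rectangular population."""
    if theta < -1:
        return 0.0
    if theta > 1:
        return 1.0
    return 0.5 * (1 + theta * (2 - abs(theta)))


def e_s_rect(theta):
    """E'(s): limiting rate of runs up, i.e. P(X1 > X2 < X3)."""
    a = abs(theta)
    if a >= 1:
        return 0.0
    if a >= 0.5:
        return 0.5 * (1 - a) ** 2
    return (2 - 9 * a ** 2 + 8 * a ** 3) / 6


def simulate(theta, trials=200_000, seed=0):
    """Monte Carlo estimates of (E'(k), E'(s)) from independent triples."""
    rng = random.Random(seed)
    plus = starts = 0
    for _ in range(trials):
        x1 = rng.random() + theta
        x2 = rng.random() + 2 * theta
        x3 = rng.random() + 3 * theta
        plus += x2 > x1            # a plus sign between the first two
        starts += x1 > x2 < x3     # a minus followed by a plus starts a run up
    return plus / trials, starts / trials
```

With `theta = 0.3` the formulas give 0.755 and about 0.2343, and the simulated proportions should agree with these to roughly two or three decimals.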
8. Expected value of k and s for normal populations. Let

(8.1) $\phi(x) = \dfrac{1}{\sqrt{2\pi}}\, e^{-x^2/2}$

and

(8.2) $\Phi(x) = \dfrac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-t^2/2}\, dt$.

We will consider in this section the alternative of a normal population with
change of position, that is,

(8.3) $f_i(x_i) = \phi(x_i - \mu_i)$

for some set of parameters $\mu_i$ $(i = 1, \cdots, n)$. We can suppose without loss of
generality that $\mu_1 = 0$. It will be enough to show how the first term of the sums
in (6.3) and (6.4) can be evaluated. Now

(8.4) $\int_{-\infty}^{\infty} \phi(x_2 - \mu_2) \int_{-\infty}^{x_2} \phi(x_1)\, dx_1\, dx_2$

is the integral of the circular normal distribution with center at $(0, \mu_2)$ over the
half plane above and to the left of the line $x_1 = x_2$. Evidently the distance from
$(0, \mu_2)$ to this line is $\mu_2/\sqrt{2}$ and the integral (8.4) is equal to $\Phi(\mu_2/\sqrt{2})$. Thus

(8.5) $E(k) \sim \sum_{i=1}^{n-1} \Phi[(\mu_{i+1} - \mu_i)/\sqrt{2}]$,

and for the linear trend, $\mu_i = (i - 1)\theta$, we have

(8.6) $E'(k) = \Phi(\theta/\sqrt{2})$.


Here, essentially, we rotated axes so that one variate was independent of the
other and then integrated it out. This elimination of one variable can be done in
the general case of the integration in (6.5). In particular, the evaluation of $E'(s)$
reduces to evaluating the circular normal distribution over a region bounded by
two half lines meeting in an obtuse angle. By a further linear transformation we
obtain the relation

(8.7) $\int_{-\infty}^{\infty} \phi(x_3 - \mu_3) \int_{-\infty}^{x_3} \phi(x_2 - \mu_2) \int_{x_2}^{\infty} \phi(x_1)\, dx_1\, dx_2\, dx_3 = K \int_{a}^{\infty} \int_{b}^{\infty} e^{-[1/2(1-\rho^2)](y_1^2 - 2\rho y_1 y_2 + y_2^2)}\, dy_1\, dy_2$,

where $\rho = \frac{1}{2}$, $a = \mu_2/\sqrt{2}$, $b = (\mu_2 - \mu_3)/\sqrt{2}$, and $K = 1/(2\pi\sqrt{1 - \rho^2})$. The
right member of (8.7) is given in Table VIII, Vol. 2 of Pearson's Tables [12].

For a linear trend, $\mu_i = (i - 1)\theta$, $E'(s)$ is given by the right member of (8.7),
with $a = \theta/\sqrt{2}$, $b = -\theta/\sqrt{2}$.

Table 1 gives values of $1 - E'(k)$, $E'(s)$ and $\sigma'^2(k)$ (see Section 10) for a linear
trend with various values of $\theta$. Pearson's table goes only to $\theta/\sqrt{2} = 2.6$ ($\theta =
3.676955$); however, it will be noted that for $\theta > 2.8$, $E'(s) = 1 - E'(k)$ correct
to five decimal places, and hence we can obtain $E'(s)$ for $\theta > 2.8$ by computing
$1 - E'(k)$. The reason for this is that as $\theta \to \infty$ the number of minus signs
becomes small, and nearly every run down is of length one. For other values of $\theta$,
$E'(k)$ can be obtained from a table of the normal integral, while $E'(s)$ and $\sigma'^2(k)$
can be obtained by interpolation in Table 1, using four-point formulas for four
decimal places or six-point formulas for full accuracy.
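
Entries of the kind shown in Table 1 can be reproduced without the bivariate tables by conditioning on the middle observation: for independent normal variates with means 0, $\theta$, $2\theta$ and unit variance, $P(X_1 > X_2 < X_3) = \int \phi(x - \theta)[1 - \Phi(x)][1 - \Phi(x - 2\theta)]\,dx$. The sketch below uses this single integral, which is a shortcut introduced here and not the reduction (8.7) used in the paper, together with $E'(k) = \Phi(\theta/\sqrt{2})$ and the variance relation $\sigma'^2(k) = 3E'(k) - 3[E'(k)]^2 - 2E'(s)$ of Section 10. The function names, step size and integration range are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist

_N = NormalDist()  # standard normal: provides pdf and cdf


def e_k(theta):
    """E'(k) = Phi(theta / sqrt(2)) for a normal population with linear trend."""
    return _N.cdf(theta / sqrt(2))


def e_s(theta, h=0.05, half_width=9.0):
    """E'(s) = P(X1 > X2 < X3), integrating over the middle value by midpoints."""
    steps = int(round(2 * half_width / h))
    total = 0.0
    for j in range(steps):
        x = theta - half_width + (j + 0.5) * h      # midpoint ordinate
        total += (_N.pdf(x - theta)                 # density of X2 (mean theta)
                  * (1 - _N.cdf(x))                 # P(X1 > x), mean 0
                  * (1 - _N.cdf(x - 2 * theta)))    # P(X3 > x), mean 2*theta
    return h * total


def var_k(theta):
    """Limiting variance of k from the relation quoted from Section 10."""
    ek = e_k(theta)
    return 3 * ek - 3 * ek ** 2 - 2 * e_s(theta)
```

For example, `e_s(0.707107)` should come out close to the tabulated .272240, and `1 - e_k(0.141421)` close to .46017.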

In Fig. 1, $E'(s)$ is plotted against $\theta$, the full line for $f_i(x_i)$ normal and the broken
line for $f_i(x_i)$ rectangular. In order to make these comparable, the rectangular
distribution has been taken with unit variance (i.e., range = $\sqrt{12}$). It will be
noted that the graphs are surprisingly close, suggesting that $E'(s)$ is not very
sensitive to changes in the form of $f_i(x_i)$ for fixed mean and variance.

9. Expected value of $s'_p$ for normal populations. For simplicity we will confine
our attention to a normal population with linear trend

(9.1) $f_i(x_i) = \phi(x_i - \mu_i)$ $(\mu_i = i\theta)$.


At the end of this section we will extend the method to the general case. 





TABLE 1

Limiting values for a normal population with unit variance and linear
trend $\mu_i = i\theta$

   θ          E'(s)      1 − E'(k)   σ'²(k)    σ'(k)
 .000000    .333333     .50000     .08333    .289
 .141421    .330590     .46017     .08405    .290
 .282843    .322524     .42074     .08611    .293
 .424264    .309601     .38209     .08910    .298
 .565685    .292542     .34458     .09244    .304
 .707107    .272240     .30854     .09554    .309

 .848528    .249673     .27425     .09777    .313
 .989949    .225818     .24196     .09862    .314
1.131371    .201577     .21186     .09779    .313
1.272792    .177722     .18406     .09510    .308
1.414214    .154873     .15866     .09072    .301

1.555635    .133483     .13567     .08484    .291
1.697056    .113851     .11507     .07779    .279
1.838478    .096143     .09680     .07000    .265
1.979899    .080415     .08076     .06189    .249
2.121320    .066635     .06681     .05378    .232

2.262742    .054716     .05480     .04597    .214
2.404163    .044526     .04457     .03869    .197
2.545584    .035913     .03593     .03209    .179
2.687006    .028708     .02872     .02628    .162
2.828427    .022747     .02275     .02120    .146

2.969848    .017863     .01786     .01689    .130
3.111270    .013903     .01390     .01332    .115
3.252691    .010724     .01072     .01038    .102
3.394113    .008197     .00820     .00800    .089
3.535534    .006210     .00621     .00609    .079

3.676955    .004661     .00466     .00463    .068
3.818377                .00347     .00344    .059
3.959798                .00256     .00253    .050
4.101219                .00187     .00187    .043
4.242641                .00135     .00135    .037

4.525483                .00069     .00069    .026
4.808326                .00034     .00034    .018
5.091169                .00016     .00016    .013
5.374012                .00007     .00007    .008
5.656854                .00003     .00003    .005

We have

(9.2) $E'(s'_p) = \int_{-\infty}^{\infty} \phi(x_{p+2} - (p+1)\theta) \int_{-\infty}^{x_{p+2}} \phi(x_{p+1} - p\theta) \cdots \int_{-\infty}^{x_3} \phi(x_2 - \theta) \int_{x_2}^{\infty} \phi(x_1)\, dx_1 \cdots dx_{p+2}$.

Fig. 1. Value of $E'(s)$ for linear trend: $X_i$ independent with unit variances and mean $i\theta$.
Solid line denotes normal population; broken line denotes rectangular population.


This could be reduced to a p-tuple integral geometrically as was done for $E'(s)$
in Section 8, but it is easier to use a method due to Kendall [13]. Let

(9.3) $y_i = \dfrac{x_{i+1} - x_i}{\sqrt{2}}$ $(i = 1, \cdots, n - 1)$.

Then

(9.4) $E'(s'_p) = \int_{-\infty}^{0} \int_{0}^{\infty} \cdots \int_{0}^{\infty} g(y_1, \cdots, y_{p+1})\, dy_{p+1} \cdots dy_1$,

where $g$ is the $(p+1)$-variate normal density with means $\eta_i$ and covariance
matrix $\|\sigma_{ij}\|$,

(9.5) $\eta_i = (\mu_{i+1} - \mu_i)/\sqrt{2}$; $\sigma_{ij} = 1$ for $i = j$, $-\frac{1}{2}$ for $|i - j| = 1$, and $0$ otherwise,

and for a linear trend $\mu_i - \mu_{i+1} = -\theta$ for all $i < n$. Kendall [13] was not
interested in power function considerations, but was investigating runs for a
different purpose. He did not consider a linear trend, but the case where the $X_i$
satisfy the stochastic difference equation

(9.6) $X_{i+2} + aX_{i+1} + bX_i = U_{i+2}$,

where the U are independently normally distributed with zero mean and unit
variances. In his case all the $\eta_i = 0$, but the matrix $\|\sigma_{ij}\|$ has no zero terms.
Kendall gave $1/E'(s)$ for certain a and b, and suggested evaluating the general
integral (9.4) by the generalized tetrachoric series expansion given in Kendall
[14]. For a general multivariate normal distribution the evaluation of this series
is extremely laborious. For the linear trend, the many zero terms in the
covariance matrix reduce the number of terms in the expansion, but the labor is still
very great, and increases geometrically with p.

An alternative way of evaluating the integral in (9.4) would be by numerical
quadrature. This would involve computing and adding N terms, with N of the
order of $a^{p+1}$ and a between 10 and 30 if reasonable accuracy were to be obtained.
This would be very laborious. It is much easier to work with the integral in
(9.2) and to evaluate it by repeated numerical quadrature. This will involve
only $(p + 2)a$ operations, and is the method we will actually use.

There are many methods of numerical quadrature, from simple ones such as 
the trapezoidal formula and Simpson’s formula to relatively complicated ones 
involving many ordinates with different weights, and even ordinates spaced at 
irrational intervals. Any of these methods will give any desired degree of accu- 
racy if the function to be integrated is well behaved and ordinates are taken 
sufficiently close together, but the methods differ in the number of ordinates 
required for specified accuracy. In a specific situation the formula requiring the 
least amount of labor should be used. 

For our problem, the easiest method is the most elementary one, namely, the
tangent formula. For a, t integers, $x_0 = \frac{1}{2}(2a - 1)h$, $x = \frac{1}{2}(2t + 1)h$, the
tangent formula with remainder is

(9.7) $\int_{x_0}^{x} f(t)\, dt = h \sum_{j=a}^{t} f(jh) + \dfrac{h^3}{24} \sum_{j=a}^{t} f''(\xi_j)$,

where $\frac{1}{2}(2j - 1)h < \xi_j < \frac{1}{2}(2j + 1)h$ (see Steffensen [15], p. 159). Thus if $f''(x)$ exists
and is continuous on the interval of integration, the error in using the tangent
formula is of order $h^2$, where h is the distance between ordinates. The advantage
of the tangent formula for our purposes is that it gives the indefinite integral
with no extra labor; for example, if we start with values of $f(x)$ at x = 0, 1, 2,
$\cdots$, we obtain approximations to $\int_{-1/2}^{x} f(t)\, dt$ at $x = \frac{1}{2}, \frac{3}{2}, \frac{5}{2}, \cdots$. These values
are then used for the second integration, and so on.
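
The repeated use of the tangent (midpoint) formula can be sketched in a few lines of Python. This is an illustrative reading of the method rather than the author's worksheet: it evaluates the nested integral for $E'(s'_p)$ under a normal population with linear trend, taking the innermost integral $\int_{x}^{\infty} \phi(t)\,dt = 1 - \Phi(x)$ exactly and then forming running sums, so that each pass yields the indefinite integral at points shifted by h/2. The function name, the truncation of the range at 7 standard deviations and the default h = .2 are assumptions.

```python
from statistics import NormalDist

_N = NormalDist()  # standard normal: provides pdf and cdf


def e_runs_up_at_least(p, theta, h=0.2, tail=7.0):
    """Approximate E'(s'_p) = P(X1 > X2 < X3 < ... < X_{p+2}), X_i ~ N((i-1)theta, 1)."""
    lo = min(0.0, (p + 1) * theta) - tail
    hi = max(0.0, (p + 1) * theta) + tail
    count = int(round((hi - lo) / h)) + 1
    xs = [lo + i * h for i in range(count)]
    g = [1.0 - _N.cdf(x) for x in xs]       # innermost integral, taken exactly
    for j in range(1, p + 2):               # one pass per remaining variable
        running = 0.0
        nxt = []
        for x, gv in zip(xs, g):
            running += h * _N.pdf(x - j * theta) * gv
            nxt.append(running)             # integral from the lower end to x + h/2
        xs = [x + h / 2 for x in xs]        # ordinates for the next pass
        g = nxt
    return g[-1]                            # value at the upper end of the range
```

With `theta = 0` the exact values are 1/3, 1/8 and 1/30 for p = 1, 2, 3, which gives a quick check on the step size; each pass costs one multiplication and one addition per ordinate.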

To illustrate the method used, consider the following expressions, where the
variables of summation vary in steps of h:

(9.8) $G_1(y) = h \sum_{x = y + h/2}^{5 + \theta} \phi(x - \theta)\left[\int_{-\infty}^{x} \phi(t)\, dt\right] \doteq \int_{y}^{\infty} \phi(x - \theta)\left[\int_{-\infty}^{x} \phi(t)\, dt\right] dx$,

where the symbol $\doteq$ means "approximately equal to", and

(9.9) $G_2(z) = h \sum_{y = -5 + 2\theta}^{z - h/2} \phi(y - 2\theta)\, G_1(y) \doteq \int_{-\infty}^{z} \phi(y - 2\theta)\left(\int_{y}^{\infty} \phi(x - \theta)\left[\int_{-\infty}^{x} \phi(t)\, dt\right] dx\right) dy$.

Then $G_1(-\infty) = E'(k)$, and $G_2(\infty) = E'(t)$.

We will see later in Table 2 that these two formulas give values close to the
exact ones even when fairly few points are used. However, for the longer runs the
exact values are not available for comparison, so something needs to be said
about the errors in this method. We confine our remarks here to the normal case.

The first approximation made is the use of finite sums to represent infinite
integrals. We have

(9.10) $\int_{-\infty}^{-5 + \mu} \phi(x - \mu)\, dx = \int_{5 + \mu}^{\infty} \phi(x - \mu)\, dx = .00000$

correct to five decimals. Since all our integrands are of the form $\phi(x - \mu) \cdot \psi(x)$
with $0 \leq \psi(x) \leq 1$, the error committed by using a finite range is always less than
.00001, and the only question is whether our finite sums are sufficiently close to
the corresponding finite integrals. We will now consider this question.

There is one source of error in $G_1(x)$, due to summing instead of integrating.
On the other hand, $G_2(x)$ has two sources of error: first, we sum instead of
integrating; second, the ordinates are themselves in error, because of errors in
$G_1(x)$. It thus seems at first sight that the errors accumulate and that only a few
iterations can be performed safely. Fortunately this is not so. The author has
shown in his dissertation [16] that the error after m numerical integrations due
to the accumulated error in the ordinates used for the last summation is less
than the error due to replacing integration by summation at the last step; that
is, no great improvement in accuracy would result if the approximate ordinates
used at the last step were replaced by the corresponding exact values obtained
by integration. Since this is so, it is only necessary to consider the error caused
by a single numerical integration.

Let $G_j(x)$ stand for the result of the jth summation, let $g_j(x) = \phi(x - j\theta)\,G_{j-1}(x)$,
and let $\epsilon_j(x)$ be the error introduced at the jth summation. If we knew $g_j''(x)$,
we could obtain a bound on $\epsilon_j(x)$ from (9.7). Since the $g_j''(x)$ are no easier to
obtain than the expected values we are looking for, we use the approximation

(9.11) $g_j''(x) \doteq \dfrac{1}{h^2}\, \Delta^2 g_j(x)$,

where $\Delta f(x) = f(x + h) - f(x)$, and $\Delta^i f(x) = \Delta[\Delta^{i-1} f(x)]$. For a general analytic
$f(x)$, the fact that $f''(x)$ was small on a tabulated set of points would not prevent
it from being uncomfortably large at some intermediate point. However, we
know that $g_j(x)$ represents a multiple normal integral and that neither the
function nor its derivatives have any sudden changes. Accordingly, if h is so small
that $\Delta^2 g_j(x)$ changes smoothly, we can be sure that the maximum second difference
is close to the maximum of $h^2 g_j''(x)$, and that it is safe to write

(9.12) $\epsilon_j(x) \doteq \dfrac{h}{24} \sum_{t} \Delta^2 g_j(th)$.


This suggests the use of Gauss's first summation formula (Steffensen [15], p. 104),

(9.13) $\int_{x_0}^{x} f(t)\, dt = h\left[\sum_{j=a}^{t} f(jh) + \sum_{\nu} K_{2\nu}\,\{\delta^{2\nu-1} f(x) - \delta^{2\nu-1} f(x_0)\}\right]$,

where

$K_2 = \dfrac{1}{24}$, $K_4 = -\dfrac{17}{5760}$, $K_6 = \dfrac{367}{967680}$, $K_8 = -\dfrac{27859}{464486400}$, $\cdots$

and $\delta^{2\nu-1} f(x)$ is the $(2\nu - 1)$th central difference of $f(x)$. By the same argument as
above, the remainder term can be approximated by the first correction term
omitted, provided the differences of the requisite order are changing smoothly.
For evaluating $E'(s'_p)$ with $\theta = \frac{1}{2}$, h = 1 is too large for smooth differences, and
successive orders of differences become large. However, with h = .2 the
differences change smoothly, and successive orders of differences rapidly become
small. Consider for example $G_1(.5)$ for $\theta = \frac{1}{2}$. The uncorrected sum is .296610 and
the successive correction terms from (9.13) are −.000233, −.000002, −.0000001.
The value x = .5 was chosen because the second correction term here assumes
its extreme value. Evidently all correction terms except the first can be ignored,
and the error of $G_1(.5)$ is only 0.1%. Similarly, the maximum error of $G_1(x)$ is
0.1% at x = 2, while the error of $G_1(\infty)$ is only 0.03%.

Table 2 gives values of $E'(k)$ and $E'(s'_p)$ for $\theta = \frac{1}{2}$ and $-\frac{1}{2}$ obtained by setting
h equal to 1, .5, and .2 and also some values obtained by using the first
correction term of (9.13). For h = .2 and $\theta = \frac{1}{2}$ the corrected values are accurate to
five decimal places and the uncorrected values to four. Furthermore, the error
actually decreases with repeated iteration (i.e., large p), although the percentage
error increases. The uncorrected values for h = .2 and $\theta = -\frac{1}{2}$ are probably
slightly less accurate, since the second derivatives are somewhat larger.
Nevertheless, for the purpose of investigating the power functions it appears that
uncorrected summation will give ample accuracy with h = .2. It should be noted
that even with h = 1, where the errors in the $G_j(x)$ are large for intermediate x,
the final values are surprisingly good.

The method of repeated summation is of very general applicability. It can
be used freely when the $X_i$ are independently normally distributed with
variances close to 1 and $|\mu_i - \mu_{i-1}| < \frac{1}{2}$. For more extreme variation it may be
necessary to use the correction term or take h < .2. It seems to be easier to take
more ordinates than to compute the differences and apply the correction term,
but the latter course should be taken occasionally to obtain an idea of the degree
of accuracy attained. The method may also be used when the $X_i$ are independent
but not normal; however, in such a case the error would have to be investigated
in the same way as we have done it here.

10. Variance. We have seen in Section 5 that it is not essential to know the
variances under the alternative hypothesis. However, if they should be desired
they can be obtained by the same methods used in Levene and Wolfowitz [1].
The only difference is that whereas under $H_0$ such probabilities as Prob $\{-\ +^{p}\ -\ +\}$
could be obtained explicitly as rational functions of p, they must now be
obtained numerically for fixed p by the methods of the three preceding sections.
In general this requires excessive additional work; however, there are two
variances which can be obtained as byproducts of the expected values. These are

(10.1) $\sigma'^2(k) = 3E'(k) - 3[E'(k)]^2 - 2E'(s)$,

and

(10.2) $\sigma'^2(s) = 2E'(s) - 5[E'(s)]^2 - E'(s'_3) - E'(t'_3)$.

Since both $E'(k)$ and $E'(s)$ are tabulated in Table 1, Section 8, for a normal
population with linear trend, it was possible to give $\sigma'^2(k)$ in the same table.
The surprising fact will be noted that $\sigma'^2(k)$ is a maximum at $\theta = 1$ rather than
at $\theta = 0$. For $\theta = 0$, the signs of adjacent differences have a negative correlation,
and apparently a moderate trend tends to make the differences more nearly
independent, thus increasing the variance of the sum, k, even though the variance
of an individual difference is greatest at $\theta = 0$. A similar condition holds for
$\sigma'^2(s)$.

For the special case $\theta = \frac{1}{2}$, we have $\sigma'^2(k) = .09088$ and $\sigma'^2(s) = .05610$,
compared with $\sigma'^2(k) = .08333$ and $\sigma'^2(s) = .04444$ for $\theta = 0$. We can then
compute the exact asymptotic power of the tests. For the test of $H_0$ against the
one-sided alternative $\theta > 0$, the asymptotically most powerful tests based on k and
s respectively are

(10.3) $\dfrac{k - n/2}{\sqrt{n/12}} \geq \lambda_\alpha$

and

$\dfrac{n/3 - s}{\sqrt{2n/45}} \geq \lambda_\alpha$.

For the level of significance $\alpha = .05$ and for power $1 - \beta = .95$ we will then
require approximately n = 50 observations for the k-test and approximately
n = 517 observations for the s-test. Thus, for this alternative, the test based
on k is about ten times as good in a certain sense as the s-test.

For the sake of comparison, we find that the asymptotic power index defined
in (5.12) is $\Delta^2(s) = .02362$. For $\alpha = \beta = .05$ and a one-sided test we must have
$n\Delta^2 = 10.822$, leading to n = 459, compared to the correct value, 517. Thus we
see that even for this considerable departure from $H_0$, the asymptotic power
index gives us a correct general idea of the power of the test.
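
The sample sizes quoted above follow from (5.7) and (5.8) by simple arithmetic, which the following Python sketch reproduces. The function names are hypothetical; the numerical inputs are the values stated in the text ($\theta = \frac{1}{2}$, $\sigma'^2(k) = .08333$ under $H_0$ and .09088 under the alternative, $\Delta^2(s) = .02362$), with $E'(k) = \Phi(\theta/\sqrt{2})$ from (8.6).

```python
from math import ceil, sqrt
from statistics import NormalDist

_N = NormalDist()


def n_required(e0, e1, sd0, sd1, alpha=0.05, beta=0.05):
    """Smallest n (as a real number) satisfying (5.7) for a one-sided test."""
    lam_a = _N.inv_cdf(1 - alpha)
    lam_b = _N.inv_cdf(1 - beta)
    return ((lam_a * sd0 + lam_b * sd1) / abs(e0 - e1)) ** 2


def n_from_index(index, alpha=0.05, beta=0.05):
    """Approximation from (5.8): n * Delta^2 = (lambda_alpha + lambda_beta)^2."""
    lam_a = _N.inv_cdf(1 - alpha)
    lam_b = _N.inv_cdf(1 - beta)
    return (lam_a + lam_b) ** 2 / index


theta = 0.5
n_k = n_required(0.5, _N.cdf(theta / sqrt(2)), sqrt(1 / 12), sqrt(0.09088))
n_s_index = n_from_index(0.02362)
print(ceil(n_k), ceil(n_s_index))
```

Rounding up gives 50 observations for the k-test and 459 for the s-test by the index approximation, in agreement with the figures in the text.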

The power of these tests will be compared with the power of tests based on 
runs above and below the median in a forthcoming paper, where cyclic alterna- 
tives will also be considered. 

Appendix. Covariance matrix of u-run statistics under $H_0$. When the sequence
$(X_1, \cdots, X_n)$ is random, the expected values are

(10.4) $E(r_p) = E(s_p) + E(t_p) = 2E(s_p) = 2n\,\dfrac{p^2 + 3p + 1}{(p + 3)!} - 2\,\dfrac{p^3 + 3p^2 - p - 4}{(p + 3)!}$,

$E(r'_p) = E(s'_p) + E(t'_p) = 2E(s'_p) = 2n\,\dfrac{p + 1}{(p + 2)!} - 2\,\dfrac{p^2 + p - 1}{(p + 2)!}$,

and $E(k) = (n - 1)/2$.
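
The null expected values are simple enough to tabulate exactly. The sketch below, with hypothetical function names, evaluates the expected number of runs of length exactly p and of length p or more using rational arithmetic, and can be used to confirm the relation $r_p = r'_p - r'_{p+1}$ between the two formulas.

```python
from fractions import Fraction
from math import factorial


def expected_runs_exact(p, n):
    """E(r_p): expected number of runs up and down of length exactly p under H0."""
    f = factorial(p + 3)
    return (Fraction(2 * n * (p * p + 3 * p + 1), f)
            - Fraction(2 * (p ** 3 + 3 * p * p - p - 4), f))


def expected_runs_at_least(p, n):
    """E(r'_p): expected number of runs up and down of length p or more under H0."""
    f = factorial(p + 2)
    return Fraction(2 * n * (p + 1), f) - Fraction(2 * (p * p + p - 1), f)
```

For n = 10, for instance, `expected_runs_at_least(1, 10)` is 19/3, the familiar (2n − 1)/3 for the total number of runs, and `expected_runs_exact(1, 10)` is 17/4.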

The exact covariances and selected numerical values are given below. Formulas
not given below may be obtained by interchanging t and s; thus $\sigma^2(s_p) = \sigma^2(t_p)$
and $\sigma(s_p, t_q) = \sigma(t_p, s_q)$. An exception to this rule is $\sigma(k, s_p) = -\sigma(k, t_p)$ and
$\sigma(k, s'_p) = -\sigma(k, t'_p)$.


o(s,, 8) = ni— [1/(q + 3)\p + 3)!] [pq + 3q + 1) + pg(d + 7q + 11) 
+ p(3q° + 11q° + 3q¢ — 10) + (q° — 10g — 7)] 
— [2/(p+ q+ 5)! ((p + 9) + 9p + 9) + 23(p +g) + 14) 
+ [5po/(p + 3)!\[p* + 3p + 1} + f[1/(g + 3)(p + 3) Nip 
+3q + 1) 
+ po +7q 4+ 11) + pg + 7¢ + 9¢ — 14g — 18) 


+ p(3q* + 11q° — 14q° — 65q — 25) + (q* — 18q° — 25q + 4)] 
+ [2/(p + ¢+5)!Il(p +g)! + 10(p + gq) + 29(p + gq)” + 16(p + q) 
— 19] — [5,¢/(p + 3)![p’ + 3p’ — p — 4}}, 

where 6,, = lif p = qand = Oif p ¥ q. 
n{— [1/(p + 3)\(p + 3)![2p° + 13p* + 24p* + (3p + 1)(p — 7)) 
— [2/(2p + 5)!][Sp° + 36p* + 46p + 14] + [1/(p + 3)!][p* + 3p + 1)} 
+ {[1/(p+3)!(p+3)!][p'(3p + 11)(p + 3 — p(28p’ + 101p + 50) + 4] 
+ [2/(2p + 5)!][16p* + 80p* + 116p’ + 32p — 19] 
— [1/(p + 3)!]lp* + 3p’ — p — 4}}. 
= n{— [I/(q + 2) + 2)" + 1) + p@ + 3q + 1) 
+ +a-)- 2/(p+q+ 3) +9 + 2 
+ [1/(@ + 2)I@ + 1} + {L1/@ + 2) + 2)! 
(pq +1) + wd + 3¢ +1) + pl + 3¢ — @ — 4) 
+@+q — 49-3) +2/P+q+3)N@+a+3p +9 +1) 
— [1/(G + 2) [4 + G — 1}, 
where G = Max (p, q). 
o°(s'p) = n{— [1/(p + 2)p + 2)!I(p + L(2p’ + 3p — 1)) 
— [4/(2p + 3) lp + 1) + L1/(p + 2) Nip + 1} + f[1/(p + 2) (p + 2)! 
[3p* + 8p’ + p’ — 8p — 3] + [2/(2p + 3)!|[4p° + 6p + 1) 
— [1/(p + 2)!\p* + p — 1}. 
o(8p, t,) =n{— [1/(q + 3)p + 3)'Jip'(@ + 3¢ + 1) 
+ pig + 11g + 27q + 12) + p(3q° + 299° + 77q + 48) 
+ (q° + 18q° + 68q + 59)] — [2/(p + q + 3)(¢ + 2)'(p + 1!) 
ip —q—-— 1] — 2/p +a t+ 5q + 3)! + DI 
+ [2/(p +4 + Da'p!l) + {11/@ + 3)p + 3) lp" + 3¢ + 1) 
+ pq + lq’ + 27q + 12) + p'(q‘ + 11g’ + 47q° + 84g + 40) 
+ p(3q' + 299° + 94q° + 123q + 45) + (q' + 18q° + 72q° + 89q + 16)] 
+ [2/(p+q4+3)q4+2)'pt+ DIM +q+ 2) (p-g- I) 
+ [2/(p +q+5)q+ 3)(p + Dip +9 + 4) 
— [(2/(p + q + 1)q'p'Iip + ql}. 
ni— [I/q + 2)'p + [op +¢q+ 3p + D@ + DI 

+ [2/(p + @ + La!p!] — [1/(p + a + 3)0(q + 2)'(—p + 2)5 
((p + 1)\(p + 2) + (¢ + Ig + 2))} 

+{L1/(@ + 2)Mp + 2)!p'@ + 1D + pg’ + 5q + 4) 

+ p(y + 5q° + 7¢ + 2) + (G + 40 + 2% — 3)] 

— (2/(p +q + lg'p'llp + a] + U/(p +9 +3) @ + 2)p + 2)I 
(p +q+2) (p+ LI(pt+2) 4+ @t+ VD + 2). 


n{—[l/(p + 3)"q + 2) 'I{p'(q + 1) + p'(q + 84g 9) 
+ p(3q° + 17q + 24) + (q° + 8q + 19)| + [2/(p + + Lp'q!] 
+ (I/(p+q4+ 2p + IG + Yi[p — a] + (1/(p + g + 3) 
(p + 2)Mqg + 2)!II(p — ¢ + Iq + 2)] + [1/(p + @ + 4) 
(p + 3)q + 2)![(p + 2p + 3) + @ + D@ + 2I} 
+ {{I/(p + 3)"q + 2)!Ip'(q + 1) + pig’ + 8q + 9) 
+ plq + 8¢ + 24¢ + 29) + p(3q + 17¢ + 34g + 43) 
+ (q¢ + 8¢° + 2lq + 29)| — 2/(p +a + Dpa'ilip +4) 
— (/p+q+ 2p + Iq + DYiIfp’ + 2 -— ¢g + 2I 
— [I/(p+ q+ 3)(p + 2q + 2)N(p + 2i(p + 2)(a +3) — 1 
— qq # 1(q + 2)]) — (/@M tat 4)(p + 3)Mq@ + 2)3) 
(p+q+ 3)[(p + 2)(p + 3) + (¢ + Dig + 2)II} . 


n{—[l/(p + 3)"¢ + 2)!IP'a + 1 + p@ + 5¢q + 3) 
+ p(3q + 5g — 1) + ( — 2g — 4) — R/p +a +4 

[(p +g) + 5p + q) + 5) + [n/(p + 3) lp + 3p + 1} 
+i @+ 3BMq + Die + YD + pg + 5q + 3) 
+ pl¢d + 5¢ + 2¢ — 5) + p(3q¢ + 5g — 15g — 16) 
+ (q° — 2g? — Iq — 4)] + [2/(p +9 + 4)![(p +g) + 6 + 9g) 
+ 8(p + 9) — 1) — [nlp + 3)!I[p' + 3p — p — All, 


where npg = Lif p > qand = Oif p < q. 


o(m, s>) = nf{l/2(p + 3)'J[(p — L(y + 3p + DI] + [1/2(—p + 3)}] 
[—p' — 2p’ + 5p’ + 7p — VJ 
o(m, 8») = n{1/2(p + 4)!|[p' + Sp’ + p®? — 14p — 4] + [1/2(p + 4)]J 
[—p* — Sp‘ + 3p* + 34p” + 20p — 12). 
Numerical Values 


o'(q) = 94563n — 58747 
0160” 907200 ’ 


o°(s) me 2n a 2 o°(s,) io 369n — 191 
aeecis WET) we ee 


45 
11824n = 27551 dcekw —989n + 319 
453600 ; oe 20160 = 


’ —509n + 499 ’ —269n + 589 
o(8,, 8) = ——= ’ a(s;, 83) = — 70080.” 


6720 
—7099n + 22016 cilia wa —82n +233 
453600 , are 5040.” 

35n — 41 na 1427n — 3333 

3360” — 20160 =” 


lin — 121 —3457n — 15112 
o\s1, te) = sie a(S82, ty) = —907200 —, 


2880” 

—1lin — 391 —47n + 114 

20160 : 5040 , 
8n — 37 


—4148n + 4987 
180 


(2, 8) = ———9o7999 >? 
yf —82n + 163 y —309n — 29 
(81, b3) = 040? o(82, ts) = — 30160.” 
—2152n + 3833 
907200 E 


2843n — 1525 


, 
a\82 > 83) 


, ’ 
a\s2, 83) - 


o(8;, ts) o(8;, ts) a 


a(81, ti) = 


, 


—3d > 7 ’ ’ 
e 70 s o(s3, ta) = 
—lin + 39 
120 : 


12 "» o(k, s;) = 


in + 19 
_) > 


l 
k,s) = -, 
os, 8) = 3 


’ lin + 1 19n — 35 
(kK. So) = ———— k, 83) GR aihicwen 
o(k, 82) 0.’ " 360 
ON THE STRUCTURE OF BALANCED INCOMPLETE
BLOCK DESIGNS¹


By W. S. Connor, Jr.²
University of North Carolina


1. Summary. In this paper there are developed for the first time analytical 
methods for the investigation of the structure of unsymmetrical balanced in- 
complete block designs. Two unsymmetrical balanced incomplete block designs 
are proved to be impossible, and for such designs in general, inequalities are 
found for the number of treatments common to two blocks. 

2. Introduction. In the balanced incomplete block design v varieties or
treatments are compared in such a manner that each treatment is assigned to r
experimental units. The units themselves are arranged into b more or less
homogeneous blocks, each containing k experimental units. Any two treatments
are required to occur together in the same block $\lambda$ times, the treatments
occurring in a given block being all different. Hence the design depends on the
five parameters v, b, r, k, $\lambda$. Clearly the following conditions are necessary:


(2.1)    $bk = vr, \qquad r(k - 1) = \lambda(v - 1).$


Fisher [1] also showed that for the existence of an actual combinatorial solu- 
tion it is necessary that 


(2.2)    $b \ge v, \quad \text{or} \quad k \le r.$
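
The conditions (2.1) and (2.2) are easy to check mechanically. The following is a minimal sketch in Python; the function name `satisfies_necessary_conditions` is ours and does not come from the paper. Passing the check says nothing about existence, since the conditions are only necessary.

```python
def satisfies_necessary_conditions(v, b, r, k, lam):
    """Return True if (v, b, r, k, lam) meets (2.1) and Fisher's inequality (2.2).

    These conditions are necessary but not sufficient for a design to exist.
    """
    counting = (b * k == v * r) and (r * (k - 1) == lam * (v - 1))
    fisher = b >= v  # the same as k <= r once b * k == v * r holds
    return counting and fisher


# Parameters r = 8, k = 6, b = 28, v = 21, lambda = 2: they pass the check,
# although the paper goes on to prove that no such design exists.
print(satisfies_necessary_conditions(21, 28, 8, 6, 2))
```

A parameter set such as v = 16, b = 8, r = 3, k = 6, $\lambda$ = 1 satisfies (2.1) but fails (2.2), which is why the two conditions are tested separately.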

The work of Yates [2], Fisher and Yates [3], Bose [4], and Bhattacharya [5],
[6], [7] provided solutions for all of the balanced incomplete block designs with
$r \le 10$, except the designs shown in the following table.

                                  TABLE I

    Reference number in                        Parameters
    Fisher and Yates's 1938 table       v      b      r      k    $\lambda$

               (8)                     15     21      7      5      2
               (10)                    22     22      7      7      2
               (12)                    21     28      8      6      2
               (14)                    29     29      8      8      2
               (28)                    36     45     10      8      2
               (30)                    46     46     10     10      2
               (24)                    46     69      9      6      1
               (31)                    51     85     10      6      1


Hussain [8], [9] proved the nonexistence of the designs (10) and (14), Nandi 
[10] showed the impossibility of (8), and Shrikhande [11] proved the nonexistence 


¹ This work was sponsored by the Office of Naval Research.
² Now with the National Bureau of Standards.
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of (30). Chowla and Ryser [12], in a sequel to a paper by Bruck and Ryser [13], 
gave general results, of which the impossibility of (10), (14) and (30) are special 
cases. (Also see Schützenberger [16].)

Of the designs listed in Table I there remain to be examined (12), (28), (24), 
and (31). It is the object of this paper to show that (12) and (28) are impossible, 
and to give a proof alternative to Nandi’s of the impossibility of (8). The investi- 
gations will incidentally throw much light on the structure of balanced incom- 
plete block designs in general. 

Before proceeding further, it is desirable to establish firmly that proofs of the
impossibility of (12) and (28) are really needed. Designs which have v = b and
r = k are called "symmetrical" designs. Associated with every symmetrical
design is a "derived" design, which has the following relation to the symmetrical
design. If the parameters of the symmetrical design are v, b, r, k, and $\lambda$, then
the parameters of the derived design, which are indicated by asterisks, are


(2.3)    $v^* = v - k, \quad b^* = b - 1, \quad r^* = r, \quad k^* = k - \lambda, \quad \lambda^* = \lambda.$


If a solution of a symmetrical design exists, then a solution of the derived design
may be obtained by deleting a block and all of the treatments in the block from
the symmetrical design. Such a solution of the derived design is said to be
"adjoinable," since the symmetrical design can be built up from it by suitably
adjoining k new treatments, $\lambda$ to each block, and a block consisting of the new
treatments. There do exist, however, in certain cases nonadjoinable solutions
for the class of designs given by (2.3).
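
As a small illustration of (2.3), the sketch below computes the parameters of the derived design from those of a symmetrical design. The function name `derived_parameters` is hypothetical; the arithmetic only restates the relation between the two parameter sets and assumes b = v and r = k.

```python
def derived_parameters(v, k, lam):
    """Parameters (v*, b*, r*, k*, lam*) of the derived design, following (2.3).

    The input describes a symmetrical design, so b = v and r = k.
    """
    b, r = v, k
    return (v - k, b - 1, r, k - lam, lam)


# The symmetrical design v = b = 25, r = k = 9, lambda = 3.
print(derived_parameters(25, 9, 3))
```

Only the parameters are computed here. Whether a particular solution of the derived design is adjoinable is a separate, structural question.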

An instructive example is due to Bhattacharya [5]. Associated with the
symmetrical design v = b = 25, r = k = 9, $\lambda$ = 3, is the derived design v = 16,
b = 24, r = 9, k = 6, $\lambda$ = 3. In this case a solution exists for the symmetrical
design, and hence there exists an adjoinable solution for the derived design.
Since it is known that every two blocks of a symmetrical design have $\lambda$
treatments in common, it follows that no two blocks of an adjoinable derived
design can have more than $\lambda$ treatments in common. If a solution exists for the
derived design which contains two blocks which have more than $\lambda$ treatments in
common, then clearly the solution is nonadjoinable. Bhattacharya gave the
following solution of the derived design for the above case which contains two
blocks (starred) which have four treatments in common, and two blocks
(underscored) which have zero treatments in common.


1, 2, 7, 8, 14, 15) (3, 5, 7, 8, 11, 13) (2, 3,8, 9, 18, 16) 
9, 12, 14) (1, 6, 7, 9, 12, 13)* (2, 5, 7, 10, 13, 15) 
(3, 4, 6, 13, 14, 15) (4, 5, 7, 9, 12, 15) 

(2, 4,9, 10,11,18) (8, 6, 7, 11, 14) 1, 2, 3, 4, 5, 6) 


(1, 4, 7, 8, 11, 16) 2,4, 8, 10, 12, 14) (5, 6, 8, 10, 15, 16) 
(1, 6, 8, 10, 12, 13)* (1, 2,: 12,15) (2, 6, 7, 9, 14, 16) 
(1, 4, 5, 13, 14,16) ¢; , 16) (1, 3, 9, 10, 15, 16) 
4, 6, 8, 9, 11, 15) 11, 14) (11, 12, 13, 14, 15, 16). 
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The above considerations show that the existence of a symmetrical design 
implies the existence of the corresponding derived design. Also the nonexistence 
of a derived design implies the nonexistence of the corresponding symmetrical 
design. But the nonexistence of a symmetrical design does not imply the non- 
existence of the corresponding derived design, since a nonadjoinable solution 
may nevertheless exist. In particular the nonexistence of designs (14) and (30)
of Fisher's tables does not rule out the possible existence of nonadjoinable
solutions for (12) and (28). In the next section a fundamental theorem is
established which, besides being useful for establishing the impossibility of the
two last-mentioned designs, has intrinsic interest inasmuch as it gives a helpful
insight into the structural nature of balanced incomplete block designs.

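3. A fundamental theorem. Before considering the theorem, we shall prove
the following useful lemma.

LEMMA 3.1. If $|A|$ is the determinant defined by

(3.1)
$$
|A| = \begin{vmatrix}
\alpha & \beta & \cdots & \beta & c_{1,v+1} & \cdots & c_{1,v+t} \\
\beta & \alpha & \cdots & \beta & c_{2,v+1} & \cdots & c_{2,v+t} \\
\vdots & \vdots & & \vdots & \vdots & & \vdots \\
\beta & \beta & \cdots & \alpha & c_{v,v+1} & \cdots & c_{v,v+t} \\
c_{v+1,1} & c_{v+1,2} & \cdots & c_{v+1,v} & c_{v+1,v+1} & \cdots & c_{v+1,v+t} \\
\vdots & \vdots & & \vdots & \vdots & & \vdots \\
c_{v+t,1} & c_{v+t,2} & \cdots & c_{v+t,v} & c_{v+t,v+1} & \cdots & c_{v+t,v+t}
\end{vmatrix},
$$

then

(3.2)    $|A| = [\alpha + (v - 1)\beta]^{-t+1} (\alpha - \beta)^{v-t-1} |B_t|,$

where $B_t$ is of order $t \times t$, and the elements of $B_t$ are

$$
b_{ju} = [\alpha + (v - 1)\beta](\alpha - \beta)\, c_{v+j,v+u}
       - [\alpha + (v - 1)\beta] \sum_{i=1}^{v} c_{i,v+u}\, c_{v+j,i}
       + \beta \sum_{i=1}^{v} c_{i,v+u} \sum_{i=1}^{v} c_{v+j,i}.
$$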

To prove the lemma let the following operations be carried out on the rows
and columns of $|A|$:

(i) Multiply the last t columns by $[\alpha + (v - 1)\beta][\alpha - \beta]$, and write an
offsetting factor outside.

(ii) Add rows 1, 2, ..., v - 1 to row v.

(iii) Take the factor $[\alpha + (v - 1)\beta]$ out of row v.

(iv) Multiply row v by $\beta$ and subtract this product from rows 1, 2, ...,
v - 1.

(v) Take the factor $(\alpha - \beta)$ out of rows 1, 2, ..., v - 1.

(vi) Subtract rows 1, 2, ..., v - 1 from row v.

(vii) Subtract suitable multiples of columns 1, 2, ..., v from columns v + 1,
v + 2, ..., v + t so as to make the elements which are both in the first v rows
and also in the last t columns equal to zero, and the lemma follows.
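
The identity of Lemma 3.1 can also be checked numerically. The sketch below is only a sanity check under stated assumptions, not the proof: it fills the border of the matrix with random integers, computes both sides in exact rational arithmetic, and assumes that $\alpha - \beta$ and $\alpha + (v - 1)\beta$ are nonzero. The helper names `det` and `lemma_3_1_sides` are ours.

```python
from fractions import Fraction
import random


def det(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n = len(m)
    result = Fraction(1)
    for col in range(n):
        pivot = next((i for i in range(col, n) if m[i][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            result = -result
        result *= m[col][col]
        for i in range(col + 1, n):
            factor = m[i][col] / m[col][col]
            for j in range(col, n):
                m[i][j] -= factor * m[col][j]
    return result


def lemma_3_1_sides(alpha, beta, v, t, rng):
    """Build a random bordered matrix A; return (|A|, right-hand side of (3.2))."""
    size = v + t
    a = [[rng.randint(-3, 3) for _ in range(size)] for _ in range(size)]
    for i in range(v):
        for j in range(v):
            a[i][j] = alpha if i == j else beta
    s = alpha + (v - 1) * beta
    d = alpha - beta
    b_t = []
    for j in range(t):
        row = []
        for u in range(t):
            col_sum = sum(a[i][v + u] for i in range(v))
            row_sum = sum(a[v + j][i] for i in range(v))
            cross = sum(a[i][v + u] * a[v + j][i] for i in range(v))
            row.append(s * d * a[v + j][v + u] - s * cross + beta * col_sum * row_sum)
        b_t.append(row)
    rhs = Fraction(s) ** (1 - t) * Fraction(d) ** (v - t - 1) * det(b_t)
    return det(a), rhs


lhs, rhs = lemma_3_1_sides(alpha=5, beta=2, v=4, t=2, rng=random.Random(0))
print(lhs == rhs)
```

Exact fractions are used so that equality of the two sides can be tested directly rather than up to rounding error.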

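Consider the "incidence" matrix N of the design, that is,

(3.3)
$$
N = \begin{pmatrix}
n_{11} & \cdots & n_{1b} \\
\vdots & & \vdots \\
n_{v1} & \cdots & n_{vb}
\end{pmatrix},
$$

where the rows represent treatments, the columns represent blocks, and $n_{ij}$ =
1 or 0 according as the ith treatment does or does not occur in the jth block.
Since every treatment is replicated r times,

(3.4)    $\sum_{j=1}^{b} n_{ij} = r, \qquad (i = 1, \cdots, v),$

and since every treatment must occur $\lambda$ times with every other treatment,

(3.5)    $\sum_{j=1}^{b} n_{ij} n_{uj} = \lambda, \qquad (i, u = 1, \cdots, v;\ i \ne u).$

Hence,

(3.6)
$$
NN' = \begin{pmatrix}
r & \lambda & \cdots & \lambda \\
\lambda & r & \cdots & \lambda \\
\vdots & \vdots & & \vdots \\
\lambda & \lambda & \cdots & r
\end{pmatrix},
$$

where $N'$ denotes the transpose of N. Clearly,

(3.7)    $|NN'| = rk(r - \lambda)^{v-1}.$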

Choose any $t \le b$ blocks of the design. Let the submatrix of N which
corresponds to these t blocks be denoted by $N_0$. Let $s_{ju}$ be the number of
treatments common to the jth and uth chosen blocks (j, u = 1, 2, ..., t). Then
the $t \times t$ symmetric matrix

(3.8)    $S_t = N_0' N_0 = (s_{ju})$

is defined to be the structural matrix of the t chosen blocks. The jth row or column
of $S_t$ corresponds to the jth chosen block and the successive elements of the
jth row or column give the number of treatments which this block has in
common with the 1st, 2nd, ..., tth chosen blocks.

Let the columns of N be permuted so that the first t columns correspond to
the t chosen blocks. Then let the incidence matrix be extended by adjoining t
new rows, so that the jth adjoined row consists of zero elements except the jth,
which is unity. We thus get

(3.9)
$$
N_1 = \left(\begin{array}{c} N \\ \begin{array}{cc} I_t & 0 \end{array} \end{array}\right),
$$

where $I_t$ is the identity matrix of order t, and 0 is the $t \times (b - t)$ zero matrix.
Then

(3.10)
$$
N_1 N_1' = \begin{pmatrix} NN' & N_0 \\ N_0' & I_t \end{pmatrix}.
$$

By application of Lemma 3.1, we obtain

(3.11)    $|N_1 N_1'| = k r^{-t+1} (r - \lambda)^{v-t-1} |C_t|,$

where

(3.12)    $c_{jj} = (r - k)(r - \lambda), \qquad c_{ju} = \lambda k - r s_{ju} \quad (j \ne u), \qquad (j, u = 1, \cdots, t),$

and (2.1) has been used in replacing $r + (v - 1)\lambda$ by rk.

The matrix $C_t$ given by (3.11) is a symmetric matrix whose elements are in
(1, 1) correspondence with the elements of the structural matrix $S_t$ of the chosen
blocks. In fact we can write

(3.13)    $C_t = \lambda k E_t + r(r - \lambda) I_t - r S_t,$

where $E_t$ is the singular $t \times t$ matrix all of whose elements are unity.

The matrix $C_t$ is defined as the characteristic matrix of the t chosen blocks. The
jth row or the jth column of $C_t$ corresponds to the jth chosen block of the design.
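
Relation (3.11) can be tried out on a very small design. The sketch below uses an example of our own choosing, not one from the paper: the design whose blocks are all 6 pairs of 4 treatments (v = 4, b = 6, r = 3, k = 2, $\lambda$ = 1). It builds the extended incidence matrix of (3.9) for t = 2 chosen blocks and compares the determinant of its Gram matrix with the right-hand side of (3.11). The helper names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def det(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n = len(m)
    result = Fraction(1)
    for col in range(n):
        pivot = next((i for i in range(col, n) if m[i][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            result = -result
        result *= m[col][col]
        for i in range(col + 1, n):
            factor = m[i][col] / m[col][col]
            for j in range(col, n):
                m[i][j] -= factor * m[col][j]
    return result


def characteristic_matrix(chosen, r, k, lam):
    """Characteristic matrix of the chosen blocks (given as sets of treatments)."""
    t = len(chosen)
    return [[(r - k) * (r - lam) if j == u
             else lam * k - r * len(chosen[j] & chosen[u])
             for u in range(t)] for j in range(t)]


v, b, r, k, lam = 4, 6, 3, 2, 1
blocks = [set(pair) for pair in combinations(range(v), 2)]  # all 6 pairs
t = 2
chosen = blocks[:t]  # the first t columns are the chosen blocks

# Extended incidence matrix: N on top, then t rows that pick out the chosen blocks.
n1 = [[1 if i in blk else 0 for blk in blocks] for i in range(v)]
n1 += [[1 if col == j else 0 for col in range(b)] for j in range(t)]
gram = [[sum(x * y for x, y in zip(row_a, row_b)) for row_b in n1] for row_a in n1]

lhs = det(gram)
rhs = (k * Fraction(r) ** (1 - t) * Fraction(r - lam) ** (v - t - 1)
       * det(characteristic_matrix(chosen, r, k, lam)))
print(lhs, rhs)
```

In this toy case b = v + t, so the extended incidence matrix is square and the common value of the two sides is the square of its determinant.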

When P is a matrix with real elements of order $s \times t$, $t \ge s$, it is well known
that $|PP'| \ge 0$. Hence if $b \ge v + t$, then $|N_1 N_1'| \ge 0$. Further, since the
elements of $N_1$ are integers, if b = v + t, then $N_1$ is square and
$|N_1 N_1'| = |N_1|^2$ is a perfect integral square. Finally if b < v + t, then
$|N_1 N_1'| = 0$. Hence we get the following fundamental theorem.

THEOREM 3.1. If $C_t$ is the characteristic matrix of any set of t blocks chosen from
a balanced incomplete block design with parameters v, b, r, k, $\lambda$, then

(i) $|C_t| \ge 0$ if t < b - v,

(ii) $|C_t| = 0$ if t > b - v, and

(iii) $k r^{-t+1} (r - \lambda)^{v-t-1} |C_t|$ is a perfect integral square if t = b - v.

To illustrate the kind of information which is contained in this theorem,
consider the design with parameters r = 9, k = 6, b = 24, v = 16, and $\lambda$ = 3. Let
the treatments be denoted by letters, and consider whether it is possible to fill
up four blocks in such a way that each block will have three treatments in
common with each of the other three blocks. One way in which this can be done
is as follows:


(3.14) (ABCDEF), (ABCGHI), (ABDGJK), (ACDGLM). 
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We can now ask whether these four blocks can form part of the completed 
design. To answer this question, apply Theorem 3.1. Now, 


    $c_{jj} = (r - \lambda)(r - k) = 18,$

and

    $c_{ju} = \lambda k - r s_{ju} = -9, \qquad (j \ne u).$

Hence,

    $|C_4| = 27^3 (-9) < 0,$


and by (i) of Theorem 3.1 it follows that (3.14) is impossible, and in fact that 
any set of four blocks of the type considered cannot form a part of the com- 
pleted design. 
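
The computation for the four blocks of (3.14) can be reproduced directly. The sketch below forms the characteristic matrix from the block intersections, using $c_{jj} = (r - k)(r - \lambda)$ and $c_{ju} = \lambda k - r s_{ju}$, and evaluates its determinant exactly. The helper names are ours.

```python
from fractions import Fraction


def det(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n = len(m)
    result = Fraction(1)
    for col in range(n):
        pivot = next((i for i in range(col, n) if m[i][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            result = -result
        result *= m[col][col]
        for i in range(col + 1, n):
            factor = m[i][col] / m[col][col]
            for j in range(col, n):
                m[i][j] -= factor * m[col][j]
    return result


def characteristic_matrix(chosen, r, k, lam):
    """Characteristic matrix of the chosen blocks (given as sets of treatments)."""
    t = len(chosen)
    return [[(r - k) * (r - lam) if j == u
             else lam * k - r * len(chosen[j] & chosen[u])
             for u in range(t)] for j in range(t)]


blocks = [set("ABCDEF"), set("ABCGHI"), set("ABDGJK"), set("ACDGLM")]
c4 = characteristic_matrix(blocks, r=9, k=6, lam=3)
print(c4[0])     # first row of the characteristic matrix
print(det(c4))   # negative, so (i) of Theorem 3.1 is violated
```

Since t = 4 is less than b - v = 8, part (i) of the theorem applies, and a negative determinant rules the four blocks out.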

Now we shall indicate some simple consequences of Theorem 3.1. By letting
t = 1 it is easy to prove Fisher's inequality (2.2). By letting t = 2 we obtain

(3.15)    $|C_2| = (r - \lambda)^2 (r - k)^2 - (\lambda k - s_{ju} r)^2 \ge 0,$

whence we obtain the

COROLLARY 3.1. $\frac{1}{r}[2\lambda k + r(r - \lambda - k)] \ge s_{ju} \ge -(r - \lambda - k)$.

For the symmetrical designs, that is, the designs with r = k, it follows from
Corollary 3.1 that $s_{ju} = \lambda$, so that any two blocks of such a design have exactly
$\lambda$ treatments in common, a result which was first noticed by Fisher. For
example, for the design with parameters r = 9, k = 6, b = 24, v = 16, and
$\lambda$ = 3, Corollary 3.1 gives the bounds

(3.16)    $4 \ge s_{ju} \ge 0,$

and Bhattacharya's solution (2.4) given before actually contains two blocks
(starred) which have four treatments in common, and also another two blocks
(underscored) with no treatments in common.
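
The bounds of Corollary 3.1 are simple to evaluate for any parameter set. The following sketch returns the lower and upper bound as exact numbers; the function name `intersection_bounds` is hypothetical.

```python
from fractions import Fraction


def intersection_bounds(r, k, lam):
    """Lower and upper bounds of Corollary 3.1 on the number of treatments
    common to two blocks, returned as (lower, upper)."""
    upper = Fraction(2 * lam * k + r * (r - lam - k), r)
    lower = -(r - lam - k)
    return lower, upper


print(intersection_bounds(9, 6, 3))   # the design r = 9, k = 6, lambda = 3
print(intersection_bounds(9, 9, 3))   # a symmetrical design: both bounds coincide
```

For a symmetrical design the two bounds coincide at $\lambda$, which is the statement that any two blocks meet in exactly $\lambda$ treatments.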

4. The structure of balanced incomplete block designs of the series
$v = \tfrac{1}{2}k(k + 1)$, $b = \tfrac{1}{2}(k + 1)(k + 2)$, $r = k + 2$, and $\lambda = 2$. It is the
object of this section to develop several lemmas about the relations between
blocks of any design belonging to this series. The first two lemmas do not
depend on Theorem 3.1, but subsequent lemmas are based on it.

Consider an initial block $B_1$, which contains the k treatments $a_1, \cdots, a_k$.
It is desired to know how the $a_j$ are distributed among the remaining (b - 1)
blocks. Let there be $n_i$ blocks which contain i of the treatments $a_j$. Then the
following relations are necessary:

(4.1)
    (i)   $\sum_{i=0}^{k} n_i = b - 1 = \tfrac{1}{2}k(k + 3),$

    (ii)  $\sum_{i=0}^{k} i\, n_i = k(r - 1) = k(k + 1),$

    (iii) $\sum_{i=0}^{k} i(i - 1)\, n_i = k(k - 1).$

Consider

(4.2)    $Q = \sum_{i=0}^{k} (i - 1)(i - 2)\, n_i,$

where $n_i$, (i = 0, ..., k), is a positive or zero integer. Now

(4.3)    $Q = \sum_{i=0}^{k} i(i - 1)\, n_i - 2 \sum_{i=0}^{k} i\, n_i + 2 \sum_{i=0}^{k} n_i = 0.$

Since i is a nonnegative integer, $(i - 1)(i - 2) \ge 0$, and since $n_i \ge 0$, it
follows from (4.3) that each term of Q is zero. Hence

(4.4)    $n_i = 0$ for i = 0 and $k \ge i > 2$.

From (4.4) and (i) and (ii) of (4.1) we obtain

LEMMA 4.1. Any block of the design has two treatments in common with
$\tfrac{1}{2}k(k - 1)$ other blocks, and one treatment in common with 2k other blocks.
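
The counts of Lemma 4.1 follow from two linear equations once it is known that a block meets every other block in one or two treatments. A minimal sketch, with a function name of our own, solves relations (i) and (ii) for the number of blocks meeting a given block in one and in two treatments, and uses relation (iii) as a consistency check.

```python
def lemma_4_1_counts(k):
    """Solve relations (i) and (ii) for (n_1, n_2), assuming only i = 1, 2 occur.

    The series has b = (k + 1)(k + 2) / 2 and r = k + 2.
    """
    total_blocks = k * (k + 3) // 2      # b - 1
    total_incidences = k * (k + 1)       # k(r - 1)
    n2 = total_incidences - total_blocks
    n1 = total_blocks - n2
    return n1, n2


for k in (5, 6, 8):
    n1, n2 = lemma_4_1_counts(k)
    # Relation (iii): the sum of i(i - 1) n_i equals k(k - 1).
    assert 2 * n2 == k * (k - 1)
    print(k, n1, n2)
```

The values k = 5, 6, 8 are the three members of the series treated in the later sections.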

Next consider two initial blocks, $B_1$ and $B_2$, which contain treatments as
follows:

(4.5)
    $B_1$: $\theta_1, \cdots, \theta_y, a_1, \cdots, a_{k-y}$
    $B_2$: $\theta_1, \cdots, \theta_y, b_1, \cdots, b_{k-y}$

The treatments $\theta_i$ (i = 1, ..., y; y = 1, 2) are the y treatments which $B_1$ and
$B_2$ have in common. It is desired to determine how the treatments of $B_1$ and
$B_2$ may be distributed among the remaining (b - 2) blocks.

The remaining (b - 2) blocks are of several types depending on how the
treatments of $B_1$ and $B_2$ occur in them. If y = 2, then $\theta_1$ and $\theta_2$ occur together
twice, in $B_1$ and $B_2$, and cannot occur together again in any other block. The
types of blocks are defined in

DEFINITION 4.1. Type 1. The block contains two treatments from each of $B_1$
and $B_2$. It is of subtype 11 or 12 according as one $\theta_i$ does not, or does, occur as
one of the two treatments.

Type 2. The block contains two treatments from one of $B_1$ and $B_2$, but only one
treatment from the other. It is of subtype 21 or 22 according as one $\theta_i$ does not, or
does, occur as one of the treatments.

Type 3. The block contains one treatment from each of $B_1$ and $B_2$. It is of
subtype 31 or 32 according as one $\theta_i$ is not, or is, the treatment.

Consider the pairs which must be formed among the treatments of $B_1$ and $B_2$.
Certain pairs occur in $B_1$ and $B_2$, leaving the following pairs to occur in the
remaining (b - 2) blocks:

    Type of pair                            Number of pairs

    $a_i b_j$                               $n_1 = 2(k - y)^2$
    $a_i a_j$ or $b_i b_j$                  $n_2 = (k - y)(k - y - 1)$
    $\theta_i a_j$ or $\theta_i b_j$        $n_3 = 2y(k - y)$

Denote the number of blocks of type li (l = 1, 2, 3; i = 1, 2) by $x_{li}$. Then
from the above considerations the following equations are necessary.

(4.6)
    (a) $4x_{11} + x_{12} + 2x_{21} + x_{31} = n_1,$

    (b) $2x_{11} + x_{21} = n_2,$

    (c) $2x_{12} + x_{22} = n_3,$

    (d) $x_{12} + x_{22} + x_{32} = yk,$

    (e) $4x_{11} + 2x_{12} + 3x_{21} + x_{22} + 2x_{31} = 2(k + 1)(k - y),$

    (f) $x_{11} + x_{12} + x_{21} + x_{22} + x_{31} + x_{32} = \tfrac{1}{2}k(k + 3) - 1.$

The equations of (4.6) may be solved to determine the number of blocks of
each type. Then remembering that $x_{l1} + x_{l2}$ is the number of blocks of
type l, we obtain

LEMMA 4.2. With respect to 2 initial blocks which have y, (y = 1, 2), treatments
in common, there exist $(k - y)(k - y + 1) + ky - [\tfrac{1}{2}k(k + 3) - 1]$ blocks of
type 1, $2y(k - y)$ blocks of type 2, and $[k(2 - y) - 2y(1 - y)]$ blocks of type 3.
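
The six counting equations for the subtype counts can be explored by brute force. The sketch below, with hypothetical function names, enumerates their nonnegative integer solutions for given k and y and collects the totals for types 1, 2, and 3. The coefficients are our reading of the equations: each type of block is assumed to contribute the number of pairs and treatments implied by Definition 4.1. In this enumeration the individual subtype counts need not be unique, while the three type totals are, and they agree with the counts stated in Lemma 4.2 for the cases tried.

```python
def solutions_of_4_6(k, y):
    """Yield nonnegative integer solutions (x11, x12, x21, x22, x31, x32)."""
    n1 = 2 * (k - y) ** 2
    n2 = (k - y) * (k - y - 1)
    n3 = 2 * y * (k - y)
    total = k * (k + 3) // 2 - 1                     # b - 2
    for x12 in range(n3 // 2 + 1):
        x22 = n3 - 2 * x12                           # pairs of a theta with an a or b
        x32 = y * k - x12 - x22                      # remaining occurrences of thetas
        for x11 in range(n2 // 2 + 1):
            x21 = n2 - 2 * x11                       # pairs a-a or b-b
            x31 = n1 - 4 * x11 - x12 - 2 * x21       # pairs a-b
            if min(x31, x32) < 0:
                continue
            occurrences = 4 * x11 + 2 * x12 + 3 * x21 + x22 + 2 * x31
            blocks = x11 + x12 + x21 + x22 + x31 + x32
            if occurrences == 2 * (k + 1) * (k - y) and blocks == total:
                yield (x11, x12, x21, x22, x31, x32)


def type_totals(k, y):
    """Set of (type 1, type 2, type 3) totals over all solutions found."""
    return {(s[0] + s[1], s[2] + s[3], s[4] + s[5]) for s in solutions_of_4_6(k, y)}


def lemma_4_2(k, y):
    """Counts of types 1, 2, 3 as stated in Lemma 4.2."""
    type1 = (k - y) * (k - y + 1) + k * y - (k * (k + 3) // 2 - 1)
    return (type1, 2 * y * (k - y), k * (2 - y) - 2 * y * (1 - y))


print(type_totals(6, 1), lemma_4_2(6, 1))
```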

Now consider several structural matrices for 5 blocks. The first structural 
matrix to be considered is 


which is a symmetric matrix. The element $s_{45}$ is unknown, and it is desired to
know what values are admissible for $s_{45}$, if the five blocks which have $S_5^{(1)}$ for
their structural matrix are to form a part of the completed design. Of course,
the admissible value is 1 or 2, or both.

Associated with $S_5^{(1)}$ is the characteristic determinant, $|C_5^{(1)}|$. Consider the
elements of $C_5^{(1)}$. For the series of designs under consideration, r - k = 2 and
$r - \lambda = k$. Hence, $c_{jj} = 2k$ and $c_{ju} = k - 2$ or $-4$, according as $s_{ju}$ = 1 or 2,
where j and u refer to the jth and uth blocks of the set of 5 blocks being
considered. The element $c_{45}$ is unknown and it is desired to know whether k - 2
or -4 or both are admissible for $c_{45}$ if the 5 blocks being considered are to form
a part of the completed design.

Evaluation of $|C_5^{(1)}|$ by Lemma 3.1 yields

(4.8)    $|C_5^{(1)}| = 2(k + 2)^2 (2k - c_{45})[2(k - 1)c_{45} + (k^2 - 28)].$

Now by Theorem 3.1 it is necessary that $|C_5^{(1)}| \ge 0$. If $c_{45} = -4$ we obtain

(4.9)    $k - 10 \ge 0,$

whence it follows that $c_{45}$ cannot be -4 and hence $s_{45}$ cannot be 2 unless
$k \ge 10$. If $c_{45} = (k - 2)$ we obtain

(4.10)    $k - 4 \ge 0,$

whence it follows that $c_{45}$ cannot be (k - 2) and hence $s_{45}$ cannot be 1 unless
$k \ge 4$. These results are contained in

LEMMA 4.3.
(i) If k < 4, then there cannot exist 5 blocks with $S_5^{(1)}$ as structural matrix.
(ii) If $4 \le k \le 9$, then in $S_5^{(1)}$, $s_{45} = 1$.
(iii) If $k \ge 10$, then both values of $s_{45}$ are admissible in $S_5^{(1)}$.
Let the second structural matrix to be considered be

(4.11)

Using Lemma 3.1 we find that the value of the determinant of the associated
characteristic matrix is

(4.12)    $|C_5^{(2)}| = 4(k + 2)^2 (2k - c_{45})[(k - 1)c_{45} + 2(k - 4)],$

which by Theorem 3.1 is nonnegative. Reasoning as for $S_5^{(1)}$ we obtain

LEMMA 4.4.
(i) If k < 3, then there cannot exist 5 blocks with $S_5^{(2)}$ as structural matrix.
(ii) If $k \ge 3$, then in $S_5^{(2)}$, $s_{45} = 1$.

Consider a third structural matrix

(4.13)

Using Lemma 3.1 we find that the value of the determinant of the associated
characteristic matrix is

(4.14)    $|C_5^{(3)}| = 4(k + 2)^2 [-(k - 1)c_{45}^2 - (k - 2)(k + 8)c_{45} + (k - 2)(k^2 - k - 18)],$

which by Theorem 3.1 is nonnegative. Reasoning as for $S_5^{(1)}$, and observing by
placement of the treatments in the blocks that the design with k = 2 cannot
contain five blocks with the structural matrix $S_5^{(3)}$, we obtain

LEMMA 4.5.
(i) If k < 3, then there cannot exist 5 blocks with $S_5^{(3)}$ as structural matrix.
(ii) If $k \ge 3$, then in $S_5^{(3)}$, $s_{45} = 2$.

5. The impossibility of balanced incomplete block designs (8) and (28) of
Fisher and Yates's table. In this section the proof of the impossibility of the
designs (8) and (28) of Table I is completed. These designs belong to the series
considered in Section 4 and correspond respectively to k = 5 and k = 8.
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For t = b we obtain from (3.8) the structural matrix S, of the design. From 
Lemmas 4.1 and 4.2 it follows that there exist two rows (blocks) of $S_b$ which
are as follows:

(5.1)
$$
\begin{array}{ccc|ccc|ccc|ccc|ccc}
k & 1 & 1 & 1 & \cdots & 1 & 1 & \cdots & 1 & 2 & \cdots & 2 & 2 & \cdots & 2 \\
1 & k & 1 & 1 & \cdots & 1 & 2 & \cdots & 2 & 2 & \cdots & 2 & 1 & \cdots & 1
\end{array}
$$

where the partitions break up the matrix $S_b$ into submatrices A, B, C, D, and
E, in left to right order. According to Lemma 4.1, rows 1 and 2 both contain
$\tfrac{1}{2}k(k - 1)$ 2's and 2k 1's. Now since blocks 1 and 2 intersect in one treatment,
it follows from Lemma 4.2 that there exist $\tfrac{1}{2}(k - 1)(k - 2)$ blocks of type 1,
2(k - 1) blocks of type 2, and k blocks of type 3. Hence, it is necessary that A
contain 3 columns, that B, C, and E each contain (k - 1) columns, and that D
contain $\tfrac{1}{2}(k - 1)(k - 2)$ columns.

Consider how the third row of $S_b$ may be filled up. By Lemma 4.1 it must
contain $\tfrac{1}{2}k(k - 1)$ 2's and 2k 1's. Since block 3 intersects block 1 in one
treatment, it follows by considering blocks 1 and 3 as initial blocks that the
number of blocks of types 1, 2, and 3 must be as given in the preceding
paragraph. Also block 3 intersects block 2 in one treatment, so the same result
holds for blocks 2 and 3 as initial blocks. Unfortunately these conditions are met
by numerous arrangements of the 1's and 2's in row 3. In fact, it follows from
Lemmas 4.1 and 4.2 that if there are (k - j - 1) 2's in row 3 of B, then there are
j 2's in row 3 of C, $[\tfrac{1}{2}(k - 1)(k - 2) - j]$ 2's in row 3 of D, and j 2's in row 3
of E, (j = 0, ..., k - 1).

Consider $S_{k+2}$, the structural matrix for the following (k + 2) blocks: the
blocks of A, the j blocks of C which have 2 in row 3, and the (k - j - 1) blocks
of E which have 1 in row 3, that is,

(5.2)
$$
S_{k+2} = \left(\begin{array}{ccc|ccc|ccc}
k & 1 & 1 & 1 & \cdots & 1 & 2 & \cdots & 2 \\
1 & k & 1 & 2 & \cdots & 2 & 1 & \cdots & 1 \\
1 & 1 & k & 2 & \cdots & 2 & 1 & \cdots & 1 \\ \hline
  &   &   &   & F &   &   & G &   \\ \hline
  &   &   &   & G' &  &   & H &
\end{array}\right),
$$

where F and H have k in the main diagonal, and are symmetric matrices. The
other elements of F, G, and H are so far unknown and will be determined below.
Comparison of the structure of $S_{k+2}$ with the structures of $S_5^{(1)}$ of (4.7), $S_5^{(2)}$ of
(4.11), and $S_5^{(3)}$ of (4.13) shows that Lemmas 4.3, 4.4, and 4.5 apply. Hence the
other elements of F are 1, the elements of G are 2, and the other elements of H
are 1, for k < 10.


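Corresponding to $S_{k+2}$ is the characteristic matrix $C_{k+2}$. It is useful to compute

(5.3)
$$
C_{k+2} = \left(\begin{array}{ccc|ccc|ccc}
2k & k-2 & k-2 & k-2 & \cdots & k-2 & -4 & \cdots & -4 \\
k-2 & 2k & k-2 & -4 & \cdots & -4 & k-2 & \cdots & k-2 \\
k-2 & k-2 & 2k & -4 & \cdots & -4 & k-2 & \cdots & k-2 \\ \hline
k-2 & -4 & -4 & 2k & \cdots & k-2 & -4 & \cdots & -4 \\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots & & \vdots \\
k-2 & -4 & -4 & k-2 & \cdots & 2k & -4 & \cdots & -4 \\ \hline
-4 & k-2 & k-2 & -4 & \cdots & -4 & 2k & \cdots & k-2 \\
\vdots & \vdots & \vdots & \vdots & & \vdots & \vdots & \ddots & \vdots \\
-4 & k-2 & k-2 & -4 & \cdots & -4 & k-2 & \cdots & 2k
\end{array}\right).
$$

Using Lemma 3.1 repeatedly we obtain

(5.4)    $|C_{k+2}| = j(j - k + 2)(k - 6)(k + 2)^{k+1}.$

Now by (ii) of Theorem 3.1, $|C_{k+2}| = 0$. From (5.4) it is clear that $|C_{k+2}| = 0$
when and only when j = 0 or (k - 2), or k = 6.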

Let k = 8. Then from (5.4), either j = 0 or j = 6. If j = 0, then consider
$S_9^{(1)}$ for blocks 1, 2, 3, and any 6 blocks of E of (5.1). Then from (3.11),

(5.5)    $|(N_1^{(1)})(N_1^{(1)})'| = 2^{83},$

where the 9 chosen blocks of $N_1^{(1)}$ are the blocks which have $S_9^{(1)}$ as structural
matrix.

If j = 6, then consider $S_9^{(2)}$ for blocks 1, 2, 3, and the six blocks of C of (5.1)
for which the third row contains 2. Then

(5.6)    $|(N_1^{(2)})(N_1^{(2)})'| = 2^{85},$

where the 9 chosen blocks of $N_1^{(2)}$ are the blocks which have $S_9^{(2)}$ as structural
matrix.

The determinant $|(N_1^{(i)})(N_1^{(i)})'|$, i = 1, 2, is not a perfect integral square.
But since t = 9 = b - v, it follows from (iii) of Theorem 3.1 that it must be a
perfect integral square. Hence the

THEOREM 5.1. The balanced incomplete block design with parameters r = 10,
k = 8, b = 45, v = 36, and $\lambda$ = 2 is impossible.
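
The argument for k = 8 can be retraced numerically. The sketch below rests on assumptions that are ours: it takes the intersection numbers described in the text (1 within A, within the chosen blocks of C, and within the chosen blocks of E; 2 between a block of C and a block of E), and it assumes that the blocks of C meet blocks 1, 2, 3 in 1, 2, 2 treatments and the blocks of E meet them in 2, 1, 1 treatments. All function names are hypothetical.

```python
from fractions import Fraction
from math import isqrt


def det(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n = len(m)
    result = Fraction(1)
    for col in range(n):
        pivot = next((i for i in range(col, n) if m[i][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            result = -result
        result *= m[col][col]
        for i in range(col + 1, n):
            factor = m[i][col] / m[col][col]
            for j in range(col, n):
                m[i][j] -= factor * m[col][j]
    return result


def characteristic_matrix(k, n_c, n_e):
    """Characteristic matrix for blocks 1, 2, 3 of A, n_c blocks of C and n_e of E."""
    groups = ["A1", "A2", "A3"] + ["C"] * n_c + ["E"] * n_e
    two = {("A1", "E"), ("A2", "C"), ("A3", "C"), ("C", "E")}  # pairs meeting in 2
    size = len(groups)
    c = [[0] * size for _ in range(size)]
    for a in range(size):
        for b in range(size):
            if a == b:
                c[a][b] = 2 * k
            else:
                pair = (groups[a], groups[b])
                meets_in_two = pair in two or pair[::-1] in two
                c[a][b] = -4 if meets_in_two else k - 2
    return c


def gram_determinant(c_t, v, r, k, lam):
    """Determinant of N_1 N_1' obtained from |C_t| through (3.11)."""
    t = len(c_t)
    return k * Fraction(r) ** (1 - t) * Fraction(r - lam) ** (v - t - 1) * det(c_t)


def is_perfect_square(x):
    """True if the rational x is the square of an integer."""
    return x.denominator == 1 and x >= 0 and isqrt(x.numerator) ** 2 == x.numerator


v, r, k, lam = 36, 10, 8, 2
for j in (0, 6):
    # k + 2 blocks exceed b - v = 9, so the determinant has to vanish.
    print(j, det(characteristic_matrix(k, j, k - j - 1)))
for n_c, n_e in ((0, 6), (6, 0)):
    value = gram_determinant(characteristic_matrix(k, n_c, n_e), v, r, k, lam)
    print(value, is_perfect_square(value))
```

Under these assumptions both nine-block determinants come out as odd powers of 2, so neither is a perfect square.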

Although a similar argument might be given for k = 5, an easy proof is as 
follows. Consider 
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where $4 is unknown but is either 1 or 2. The corresponding characteristic 
determinant is 


(5.8)    $|C_4| = 7(10 - c_{34})(13 c_{34} + 38).$


By Theorem 3.1, it is necessary that $|C_4| \ge 0$. It follows that $c_{34} = k - 2 = 3$.
Hence, $s_{34} = 1$ and blocks 1, 3, and the four blocks of C of (5.1) have the
structural matrix

The corresponding characteristic determinant is 
(5.10) \C.| = 

and from (3.11), 

(5.11) |NiNi| = 5”, 


where the 6 chosen blocks of $N_1$ are the blocks which have $S_6$ as structural
matrix. The determinant $|N_1 N_1'|$ is not a perfect integral square, which
contradicts (iii) of Theorem 3.1. Hence, the

THEOREM 5.2. The balanced incomplete block design with parameters r = 7,
k = 5, b = 21, v = 15, and $\lambda$ = 2 is impossible. This result was obtained by
Nandi [10] by a different method.

6. The impossibility of the balanced incomplete block design (12) of Fisher
and Yates's table. This design is the member of the series of Section 4 which has
k = 6. From (5.4) it is seen that (k - 6) is a factor of $|C_{k+2}|$. Hence the
argument used for k = 8 will not apply for k = 6.

Consider (5.1) in which two rows of $S_b$ are given. Assume that there do not
exist five blocks having for their structural matrix


This assumption will be contradicted. For the assumption to be true it is
necessary for row 3 of C to contain exactly three 2's. For if it contains fewer
than three 2's then it contains at least three 1's, which we may for definiteness
take to be in columns 1, 2, and 3 of C. But then blocks 1 and 3 of A, and 1, 2,
and 3 of C form the structural matrix of (6.1), by Lemma 4.4. If row 3 of C
contains more than three 2's, then by Lemma 4.3, blocks 1 of A and any 4 blocks
of C which have 2 in row 3 form the structural matrix of (6.1). Hence, there
exist three blocks such that the first three rows of $S_b$ are as shown below.


Denote the element in row i and column j of submatrix B of $S_b$ by (i, j). Then
for the structural matrix of (6.1) not to exist, it is necessary in B that


(6.3)    (4, 2) = (4, 3) = (5, 3) = 2.


But then blocks 1, 2, 4, 5, and 6 of $S_b$ form $S_5^{(2)}$ with $s_{45} = 2$, which contradicts
Lemma 4.4. Hence, the

LEMMA 6.1. If the design exists then there exist five blocks having the matrix of
(6.1) for their structural matrix.

Without loss of generality, let the matrix of (6.1) be the leading principal minor
matrix of order 5 in $S_b$. Let $S_b$ be partitioned as in (5.1). Then row 3 of B
contains at least two 1's and cannot contain more than three 2's. Hence, row 3
of C cannot contain fewer than two 2's. If row 3 of C contains u 2's, then by
Lemma 4.2, row 3 of E contains (5 - u) 1's, (u = 2, ..., 5).

Case 1. Row 3 of C contains either two or three 2's. Then row 3 of E contains
at least two 1's. Let $S_7^{(1)}$ be the structural matrix for the three blocks of
A, any two blocks from C which have 2 in row 3, and two blocks from E which
have 1 in row 3. Then

(6.4)
$$
S_7^{(1)} = \left(\begin{array}{ccc|cc|cc}
6 & 1 & 1 & 1 & 1 & 2 & 2 \\
1 & 6 & 1 & 2 & 2 & 1 & 1 \\
1 & 1 & 6 & 2 & 2 & 1 & 1 \\ \hline
1 & 2 & 2 & 6 & 1 & 2 & 2 \\
1 & 2 & 2 & 1 & 6 & 2 & 2 \\ \hline
2 & 1 & 1 & 2 & 2 & 6 & 1 \\
2 & 1 & 1 & 2 & 2 & 1 & 6
\end{array}\right),
$$

where the partitions separate the blocks from A, C, and E, in that order. The
elements in rows 4, 5, and 6 and not in the main diagonal of $S_7^{(1)}$ are uniquely
determined by Lemmas 4.3, 4.4, and 4.5.

Case 2. Row 3 of C contains either four or five 2's. Let $S_7^{(2)}$ be the structural
matrix for the three blocks of A and any four blocks of C which contain 2 in
row 3. Then

(6.5)
$$
S_7^{(2)} = \left(\begin{array}{ccc|cccc}
6 & 1 & 1 & 1 & 1 & 1 & 1 \\
1 & 6 & 1 & 2 & 2 & 2 & 2 \\
1 & 1 & 6 & 2 & 2 & 2 & 2 \\ \hline
1 & 2 & 2 & 6 & 1 & 1 & 1 \\
1 & 2 & 2 & 1 & 6 & 1 & 1 \\
1 & 2 & 2 & 1 & 1 & 6 & 1 \\
1 & 2 & 2 & 1 & 1 & 1 & 6
\end{array}\right),
$$

where the partition separates the blocks from A and C.

An easy computation shows that (iii) of Theorem 3.1, that is, the perfect
integral square condition, does not rule out either Case 1 or Case 2.

We shall however use the notion of "rational congruence" to prove the
impossibility of this design. Let a symmetric matrix A and a matrix B be
nonsingular matrices of order n with rational elements, and let x and y be column
vectors with n variables each. Then if there exists a transformation x = By
which carries the quadratic form f = x'Ax into the form g = y'B'ABy, we say
that f and g are rationally congruent forms, and likewise we say that the matrices
A and C = B'AB are rationally congruent matrices.

Consider the Hasse invariant

(6.7)    $c_p(f) = (-1, -D_n)_p \prod_{i=1}^{n-1} (D_i, -D_{i+1})_p,$

where p is a prime, $D_i$ is the leading principal minor determinant of order i in
the coefficient matrix A of f, and $(a, b)_p$ is Pall's [14] generalization of the
Hilbert norm-residue symbol. For properties of this symbol, see, for example, [11].
Let i = the index of the form, and d = the square-free integer part of the
determinant of the form. Then we have

THEOREM 6.1. Two forms f and g are rationally congruent if and only if they
have the same values for their invariants n, i, d, and $c_p$ for every p. These
invariants are not independent of each other but satisfy certain relations which
will not be stated here. For a proof of this important theorem consult the book by
Jones [15].

Instead of considering the rational congruence of quadratic forms, we may
consider the rational congruence of their coefficient matrices. Thus if f = x'Ax,
we may write $c_p(A)$ instead of $c_p(f)$.

Now consider $C_7^{(1)}$ and $C_7^{(2)}$, which correspond respectively to $S_7^{(1)}$ of (6.4) and
$S_7^{(2)}$ of (6.5). Multiply the last two rows and columns of $C_7^{(1)}$ by -1. The result
is $C_7^{(2)}$. Hence $C_7^{(1)}$ is rationally congruent to $C_7^{(2)}$, and it follows that
consideration of Case 1 only is sufficient.

Let $N_1$ be the matrix of (3.9) which has the 7 blocks of $S_7^{(1)}$ of (6.4) as chosen
blocks. Now we may regard $N_1 N_1'$ and I, the identity matrix of order b, as the
coefficient matrices of quadratic forms. Since

(6.8)    $N_1^{-1} (N_1 N_1') (N_1')^{-1} = I,$

$N_1 N_1'$ and I are rationally congruent. From (6.7), $c_p(I) = +1$ for p odd. Hence
it is necessary that
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(6.9) cp(NiNi) = +1. 


We may evaluate c,(N,N}) for a general N, by a method similar to that in 
[11] or [13] and obtain 


c9(NiN) = (—1,7r — APO (p — yr) 


(6.10) : ; ; ; b—v—-1 

R (r on A, k)> (v, T)p (v, k)p (v, _iitteat A) p(— i, —Db)> Il (Deas; — Dvj41)p ’ 
3=0 

for p an odd prime. For the $N_1$ under consideration,

(6.11)    $c_p(N_1 N_1') = (3, 2)_p (7, 2)_p (5, -1)_p$

for p an odd prime, and for p = 3,

(6.12)    $c_3(N_1 N_1') = -1.$

From (6.9) and (6.12), Theorem 6.1 is contradicted, and hence we obtain

THEOREM 6.2. The balanced incomplete block design with parameters r = 8,
k = 6, b = 28, v = 21, and $\lambda$ = 2 is impossible.
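
The value in (6.12) can be reproduced with the standard formula for the Hilbert symbol at an odd prime. The sketch below is hedged accordingly: it assumes that the symbol used in the paper agrees with the ordinary Hilbert norm-residue symbol for nonzero integer arguments and odd p, and it evaluates only the product in (6.11). The function names are ours.

```python
def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p and a not divisible by p."""
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1


def split(n, p):
    """Write the nonzero integer n as p**e * u with u prime to p; return (e, u)."""
    e = 0
    while n % p == 0:
        n //= p
        e += 1
    return e, n


def hilbert_symbol(a, b, p):
    """Hilbert symbol (a, b)_p for nonzero integers a, b and an odd prime p."""
    alpha, u = split(a, p)
    beta, w = split(b, p)
    sign = -1 if (alpha * beta) % 2 == 1 and p % 4 == 3 else 1
    return sign * legendre(u, p) ** beta * legendre(w, p) ** alpha


def c_p(p):
    """The product (3, 2)_p (7, 2)_p (5, -1)_p for an odd prime p."""
    return hilbert_symbol(3, 2, p) * hilbert_symbol(7, 2, p) * hilbert_symbol(5, -1, p)


print(c_p(3))
```

At p = 3 only the factor $(3, 2)_3$ differs from +1, because 2 is not a quadratic residue modulo 3.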
I wish to express my thanks to Professor R. C. Bose, under whose guidance 
this research was carried out. 
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1. Summary. When an infinite lot consisting of defective and nondefective 
items is investigated by means of a group sequential sampling plan, the use of 
matrices and vectors is helpful in determining the probabilities of various com- 
binations of the two classes of items and in computing unbiased estimates of the 
lot fraction defective. For a sequential plan of the Bartky [1] type, the infinite 
summation of such vectors leads to an exact, explicit formula for the average 
number of items inspected, 


(1.1)    $\bar{n}_p = p^{-1}\{L_p[G(h_1 + h_2 - 1) - (h_1 + h_2)] - G(h_2 - 1) + h_1 + h_2 - [h_1]\},$


where p is the fraction defective in the lot, $L_p$ is the probability of arriving at a
decision to accept the lot, $h_1$ and $h_2$ are parameters of the plan as defined by the
Statistical Research Group [3], G(z) is defined by Bartky's equation (36), and
$[h_1]$ is the largest integer equal to or less than $h_1$.

In approximating $L_p$, or in finding the parameters of a sequential plan with
specified risks, the formulas proposed by Wald [2] and the Statistical Research
Group can be improved by adding an adjustment,


(1.2) a = 3(1 — 2s), 


to the value of $h_2$ wherever it occurs. Their formula for approximating $\bar{n}_p$ can be
improved by adding the adjustment


(1.3) cq = aq/(1 — s) 


wherever $h_2$ occurs, provided that the value of $L_p$ which appears in this formula
is arrived at by employing adjustment (1.2).

2. Introduction. The sampling plans considered here are among those used in 
acceptance sampling where the purpose of the plan is to provide objective 
criteria for deciding whether the fraction defective in a lot of infinite size is 
excessive or not. Inspection of randomly selected items from the lot continues 
as long as 


(2.1)    $ns - h_1 < d < ns + h_2,$


where n is the cumulative number of items inspected, d is the cumulative number
of defective items found, and s, $h_1$, and $h_2$ are positive numbers that are
chosen to give the sampling plan certain desired properties and that may be
regarded as parameters defining a particular plan of this type. When (2.1) is
no longer true, inspection ceases and the indicated decision is recorded. If
$d \le ns - h_1$, the decision is favorable, and the lot is accepted as not having an
excessive number of defective items. If $d \ge ns + h_2$, the decision is unfavorable,
and the lot is rejected. The optimum properties of a sampling plan of this
type, with respect to the amount of inspection necessary before a decision is
reached, have been discussed by Wald and Wolfowitz [6].
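
The stopping rule just described can be sketched as a short simulation. This is an illustration under our own assumptions, not a procedure from the paper: items are inspected one at a time, each is defective independently with probability p (an infinite lot), and the function name `run_plan` and the cap `max_items` are hypothetical.

```python
import random


def run_plan(p, s, h1, h2, rng, max_items=10**6):
    """Inspect items one at a time until the continuation condition (2.1) fails.

    Returns (decision, n), where decision is "accept" or "reject" and n is the
    number of items inspected.
    """
    n = d = 0
    while n < max_items:
        n += 1
        if rng.random() < p:        # the item just inspected is defective
            d += 1
        if d <= n * s - h1:
            return "accept", n
        if d >= n * s + h2:
            return "reject", n
    raise RuntimeError("no decision reached within max_items")


rng = random.Random(12345)
results = [run_plan(p=0.05, s=0.25, h1=1, h2=2, rng=rng) for _ in range(2000)]
accepted = sum(1 for decision, _ in results if decision == "accept")
average_n = sum(n for _, n in results) / len(results)
print(accepted / len(results), average_n)
```

The two printed numbers are Monte Carlo estimates of the probability of acceptance and of the average number of items inspected for the chosen parameters. They vary with the seed and are no substitute for exact formulas.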

If $1/s$, $h_1/s$, $h_2/s$, and $h_1 + h_2$ are all positive integers, the plan is equivalent
to a group plan like that outlined in Table 1, where an initial group of size 
vo = 0 is selected, followed if necessary by additional groups of size v = 1/s > ». 
The sampling plan for this case has the important practical advantage that 
reference to a chart or a table (like Table 1) is necessary only once for each 
group inspected. 

For group plans of this type, Bartky [1] derived an exact formula for the
probability of acceptance, here denoted by $L_p$, and for the average or expected
value of the total number of groups that have been selected and wholly or
partly inspected when inspection ceases and the appropriate decision is reached.
These formulas were obtained by summing vectors, one vector for each value
of n for which $ns - h_1$ is an integer, with an element for each of the $h_1 + h_2 - 1$
integral values of d satisfying (2.1), and with each element equal to the joint
probability (a) that n or more items will be inspected before reaching a decision,
and (b) that exactly d defective items will be found among the first n items
inspected. By a slightly different approach, Girshick [5] derived a formula for
$L_p$ that is equivalent to Bartky's. His results indicate that this formula holds
also when $h_1 + h_2$ is not an integer. For the still more general case where s is
any rational number between 0 and 1, Pólya [7] described a method of computing
$L_p$ and $\bar{n}_p$ by solving difference equations, $\bar{n}_p$ being used here to denote the
average or expected number of items inspected until a decision is reached to
accept or reject the lot. His results, however, are stated in terms of polynomials
for which explicit formulas are not given except for an illustrative example.
Walker [9] obtained an exact, explicit formula for $L_p$ for the case where s is
rational, and found the mean and variance of the number of items inspected in
terms of functions not stated explicitly. Wald [2] gave formulas for approximating
$L_p$ and $\bar{n}_p$ for the still more general case where s, $h_1$, and $h_2$ are not
necessarily rational. As pointed out by Mrs. Robinson [8], the errors in Wald's
approximations are sometimes of considerable size.

For plans of the type outlined in Table 1, the approach taken here is to
define a vector for every integer $n \ge 0$, including values of $n$ where
$ns - h_1$ is not an integer as well as the values discussed by Bartky. The method
of computing these vectors is described in Section 3 following. As indicated in
Section 4, such vectors are useful in arriving at unbiased estimates of the lot
fraction defective by the general method suggested by Girshick, Mosteller, and
Savage [4]. They also simplify the summing of probabilities (or its description,
at any rate), and facilitate the use of some results already obtained by Bartky.
These results are briefly reviewed in Sections 5 and 6, and are used in Section 7
to derive an exact, explicit formula for $\bar{n}_p$. An approach similar to this
can obviously be followed in connection with double sampling, truncated
sequential sampling, and other plans for sampling attributes. Matrices with more
than two dimensions may be employed where there are more than two attributes.
Bartky suggested methods of approximating $L_p$ and the average number of
groups selected. These methods can also be extended to the approximation of
$\bar{n}_p$. Such approximations are frequently much closer to the exact values
than the approximations proposed by Wald. By adding certain adjustments to the
parameter $h_2$ in Wald’s formulas, however, it is possible to obtain fairly
simple approximations that are not greatly different from those resulting from
the application of Bartky’s suggestions. These various approximations are
discussed in Section 8, and comparisons between them and the exact values for
illustrative examples are shown in the accompanying tables. The nature of the
errors in


TABLE 1
A group sequential plan of the Bartky type

                   Upon finding d defective items (d = 0, 1, ...)
Total number       accept if d is      continue inspection if d is          reject if d is
of items           equal to or         equal to one of the numbers          equal to or
selected           less than                                                greater than

v_0                v_0 s - h_1         v_0 s - h_1 + 1, ..., v_0 s + h_2 - 1   v_0 s + h_2
v_0 + v            v_0 s - h_1 + 1     v_0 s - h_1 + 2, ..., v_0 s + h_2       v_0 s + h_2 + 1
v_0 + 2v           v_0 s - h_1 + 2     v_0 s - h_1 + 3, ...                    ...
v_0 + 3v           v_0 s - h_1 + 3     ...                                     ...

NOTE: $1/s$, $h_1/s$, $h_2/s$, and $h_1 + h_2$ are positive integers; $h_1 + h_2 \ge 2$; $v = 1/s$;
$v_0 = (h_1 - [h_1])/s < v$, where $[h_1]$ is the largest integer $\le h_1$.


Wald’s approximations is outlined in Section 9, and suggestions are offered for
improving his procedure for choosing the value of $h_2$ when designing a
sequential sampling plan.

The notation used here largely follows that introduced by the Statistical 
Research Group, Columbia University [3], where applicable. Elsewhere, Bartky’s 
symbols have been used extensively. In making close comparisons with his 
article [1], however, it should be noted that the definitions of some of his sym- 
bols have been changed slightly in order to simplify the summarization of his 
results. Some of this simplification is made possible by restricting our discussion
to sampling plans where $h_1 > 0$, whereas Bartky also considered plans where
$h_1 < 0$.

3. The probability that inspection will continue. Let $n$, $s$, $h_1$, and $h_2$ have
the same meaning as in inequality (2.1); and let $d_1$ be the smallest integer
greater than $ns - h_1$. For $i = 2, 3, \cdots, k$, let

(3.1)    $d_i = d_1 + i - 1$,

where $k$ is the smallest integer equal to or greater than $h_1 + h_2$. A vector
$P(n)$ can now be defined for every integer $n \ge 0$ by letting $P_i(n)$ denote the
probability that inspection will continue on account of finding exactly $d_i$
defective items among the first $n$ items inspected. In other words, let $P_i(n)$
represent the joint probability (a) that $n$ or more items will be inspected before
arriving at a decision to accept or reject the lot, and (b) that exactly $d_i$
defective items will be found among the first $n$ items inspected, with the special
condition imposed that

(3.2)    $P_i(n) = 0$ if $d_i \ge ns + h_2$.

If we let $p$ equal the fraction defective in the lot, and $q$ equal $1 - p$, we can
now write

(3.3)    $P(n + 1) = J(n)P(n)$,

where $P(0)$ is the vector with elements

(3.4)    $P_i(0) = \begin{cases} 1 & \text{if } i \text{ is the smallest integer} \ge h_1, \\ 0 & \text{otherwise,} \end{cases}$

for $i = 1, 2, \cdots, k$, and where $J(n)$ is one of several $k \times k$ matrices
with elements equal to $p$, $q$, or 0. In particular, if $0 < s < \tfrac{1}{2}$, then

(3.5)    $J(n) = \begin{cases} A & \text{if } (n+1)s - h_1 < d_1 \text{ and } d_k < (n+1)s + h_2, \\ B & \text{if } (n+1)s - h_1 \ge d_1, \\ C & \text{if } (n+1)s - h_1 < d_1 \text{ and } d_k \ge (n+1)s + h_2, \end{cases}$

where the matrices $A$, $B$, and $C$ have the respective elements

(3.6)    $A_{ij} = \begin{cases} q & \text{if } j = i, \\ p & \text{if } j = i - 1, \\ 0 & \text{otherwise,} \end{cases}$

(3.7)    $B_{ij} = \begin{cases} p & \text{if } j = i < k, \\ q & \text{if } j = i + 1, \\ 0 & \text{otherwise,} \end{cases}$

(3.8)    $C_{ij} = \begin{cases} q & \text{if } j = i < k, \\ p & \text{if } j = i - 1 < k - 1, \\ 0 & \text{otherwise.} \end{cases}$


Examples of several successive vectors for an illustrative example are shown in
Table 2. (If $\tfrac{1}{2} < s < 1$, the same approach may be followed after
substituting $1 - s$ for $s$ and interchanging $h_1$ and $h_2$, and $p$ and $q$, with
corresponding changes in the interpretation of the results.)
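
The recursion (3.3) can be checked numerically without forming the matrices
explicitly. The following Python sketch is not part of the original paper; the
function name and the use of exact fractions are assumptions made for
illustration. It carries the probability of each continuing value of d from one
item to the next and drops the values of d that violate inequality (2.1). For
s = .3, h_1 = .7, h_2 = 1.6 it gives P(2) = (q^2, 2pq, p^2) and
P(3) = (3pq^2, 3p^2 q, 0).

```python
from fractions import Fraction


def continuation_vectors(s, h1, h2, p, n_max):
    """Return a list whose nth entry maps d to the probability that inspection
    continues with exactly d defective items among the first n items."""
    q = 1 - p
    current = {0: Fraction(1)}  # before any inspection, d = 0 with certainty
    history = [current]
    for n in range(1, n_max + 1):
        reached = {}
        for d, prob in current.items():
            reached[d] = reached.get(d, 0) + prob * q          # nth item nondefective
            reached[d + 1] = reached.get(d + 1, 0) + prob * p  # nth item defective
        # Keep only the values of d for which inequality (2.1) still holds.
        current = {d: pr for d, pr in reached.items()
                   if n * s - h1 < d < n * s + h2}
        history.append(current)
    return history


# Illustrative case of Table 2, evaluated at an arbitrary p (here 1/4).
s, h1, h2 = Fraction(3, 10), Fraction(7, 10), Fraction(16, 10)
for n, vec in enumerate(continuation_vectors(s, h1, h2, Fraction(1, 4), 4)):
    print(n, {d: str(pr) for d, pr in sorted(vec.items())})
```

Exact fractions are used so that boundary cases such as d = ns + h_2 are not
misclassified by floating-point rounding.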

TABLE 2
Some values of P(n) for the illustrative case where s = .3, h_1 = .7, h_2 = 1.6

 n    ns - h_1    d_1   d_2   d_3    ns + h_2    P_1(n)     P_2(n)       P_3(n)

 0      -.7        0     1     2       1.6        1          0            0
 1      -.4        0     1     2       1.9        q          p            0
 2      -.1        0     1     2       2.2        q^2        2pq          p^2
 3       .2        1     2     3       2.5        3pq^2      3p^2 q       0
 4       .5        1     2     3       2.8        3pq^3      6p^2 q^2     0

NOTE: P(1) = CP(0); P(2) = AP(1); P(3) = BP(2); P(4) = CP(3).

For a group sampling plan of the type indicated in Table 1, where an initial
group is selected of size $v_0$ such that $0 \le v_0 < 1/s$, followed by one or more
additional groups, if necessary, of size $v = 1/s$, let $c$ represent the number of
groups of size $v$ that have been completely inspected when the $n$th item is
inspected. Also, for $n \ge v_0$, let

(3.9)    $t = n - (v_0 + cv), \quad c = 0, 1, 2, \cdots,$

so that wherever defined, $t$ has one of the values $0, 1, \cdots, v - 1$,
representing the number of items inspected in the $(c + 1)$th group of size $v$.
Then it can be verified that

(3.10)    $P(n) = \begin{cases} A^t M^c P(0) & \text{if } v_0 = 0, \\ A^n P(0) & \text{if } 0 \le n < v_0, \\ A^t M^c B A^{v_0 - 1} P(0) & \text{if } n \ge v_0 \ge 1, \end{cases}$

where the superscripts indicate repeated multiplication,

(3.11)    $M = BA^{v-1}$,

and $A^0 = M^0 = I$, the identity matrix. Let $C_a^b$ be defined for any integer $a$
and any nonnegative integer $b$ by the relationship

(3.12)    $C_a^b = \begin{cases} b!/[a!(b - a)!] & \text{if } 0 \le a \le b, \\ 0 & \text{otherwise,} \end{cases}$

where $0! = \Gamma(1) = 1$. Then $M$ has the elements

(3.13)    $M_{ij} = \begin{cases} C_{i-j+1}^{v}\, p^{i-j+1} q^{v-i+j-1} & \text{if } i = 1, 2, \cdots, k - 1, \\ 0 & \text{if } i = k, \end{cases}$

being equivalent to $M_{ij}$ in Bartky’s [1] equation (7) except for an additional
row and column.

4. Estimating the fraction defective. Suppose a decision to accept or reject a
particular lot has been reached after inspecting $n_p$ items and finding $d_p$
defective items, and that it is desired to obtain an unbiased estimate, $\hat{p}$,
of the fraction defective. Following the method of Girshick, Mosteller, and
Savage [4], we may write

(4.1)    $\hat{p} = K^*(n_p, d_p)/K(n_p, d_p)$,

where $K(n_p, d_p)$ is the number of ways of selecting $d_p$ defective items and
$n_p - d_p$ nondefective items without arriving at a decision to accept or reject
the lot before inspecting the $n_p$th item, and $K^*(n_p, d_p)$ is the number of such
ways when the first item selected is defective. But since a decision has finally
been reached after finding $d_p$ defective items among the $n_p$ items inspected,
there must have been $d_r$ defective items among the first $n_r$ items inspected,
where $n_r = n_p - 1$, and $d_r = d_p$ or $d_p - 1$ according to whether the lot has
been accepted or rejected. Conversely, after finding $d_r$ defective items among
the first $n_r$ items inspected, there was just one way of finding $d_p$ defective
items among the $n_p$ items inspected. It follows that

(4.2)    $\hat{p} = K^*(n_r, d_r)/K(n_r, d_r)$.

Let $P_r(n_r)$ denote the probability of finding exactly $d_r$ defective items among
the first $n_r$ items inspected; and let $P_r^*(n_r)$ denote the conditional
probability of finding exactly $d_r$ defective items among the first $n_r$ items
when the first item inspected is defective. Also, let $d_1$ be the smallest integer
greater than $n_r s - h_1$, and $r$ be the integer determined by substituting $r$ for
$i$ and $d_r$ for $d_i$ in (3.1). Then $P_r(n_r)$ is the $r$th element in the vector
$P(n_r)$ as defined in Section 3. Similarly, $P_r^*(n_r)$ is the $r$th element in the
vector $P^*(n_r)$ defined by the equations

(4.3)    $P^*(1) = J^*(0)P(0)$,
(4.4)    $P^*(n + 1) = J(n)P^*(n)$,

where $J^*(0)$ is the matrix obtained by substituting 1 for $p$ and 0 for $q$ in
$J(0)$ as defined in that section. Moreover,

(4.5)    $P_r(n_r) = K(n_r, d_r)\, p^{d_r} q^{n_r - d_r}$

and

(4.6)    $P_r^*(n_r) = K^*(n_r, d_r)\, p^{d_r - 1} q^{n_r - d_r}$.

Hence,

(4.7)    $\hat{p} = p P_r^*(n_r)/P_r(n_r)$,

where the right-hand side of this equation is determined for any arbitrary value
of $p$ such that $0 < p < 1$.
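
The path counts in (4.1) can also be obtained by direct enumeration. The sketch
below is an illustration added here, not the author's procedure: the function
names are hypothetical, and the counts are accumulated at the stopping points
themselves, which by the argument leading to (4.2) equal the counts at the
preceding continuation points. For each point (n, d) at which inspection stops
it records K, the number of admissible orderings, and K*, the number of those
that begin with a defective item.

```python
from fractions import Fraction


def stopping_point_counts(s, h1, h2, n_max):
    """Map each stopping point (n, d) with n <= n_max to the pair (K, K_star)."""
    continuing = {0: (1, 0)}  # d -> (K, K*) over paths still under inspection
    stops = {}
    for n in range(1, n_max + 1):
        reached = {}
        for d, (k, k_star) in continuing.items():
            # nth item nondefective
            a, b = reached.get(d, (0, 0))
            reached[d] = (a + k, b + k_star)
            # nth item defective; for n == 1 the path starts with a defective item
            a, b = reached.get(d + 1, (0, 0))
            reached[d + 1] = (a + k, b + (k if n == 1 else k_star))
        continuing = {}
        for d, counts in reached.items():
            if n * s - h1 < d < n * s + h2:   # inequality (2.1): keep inspecting
                continuing[d] = counts
            else:                             # decision reached at (n, d)
                stops[(n, d)] = counts
    return stops


def unbiased_estimate(stops, n, d):
    """Estimate (4.1) at the stopping point (n, d)."""
    k, k_star = stops[(n, d)]
    return Fraction(k_star, k)


stops = stopping_point_counts(Fraction(3, 10), Fraction(7, 10), Fraction(8, 5), 400)
print(unbiased_estimate(stops, 4, 3))
```

Summing K* p^d q^(n-d) over all stopping points gives the expected value of the
estimate, which should equal p apart from the negligible probability of
inspecting more than n_max items.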

For a sampling plan like that shown in Table 1, a convenient way of computing
$P_r^*(n_r)$ and $P_r(n_r)$ is by first finding the operator $O(n_r)$ that is
equivalent to the product of the matrices shown on the right-hand side of (3.10)
for $n = n_r$, and then employing the relationships

(4.8)    $P(n_r) = O(n_r)P(0)$,
(4.9)    $P^*(n_r) = O(n_r)P^*(0)$.

The use of (4.9), however, will necessitate finding elements of a vector $P^*(0)$
satisfying equations (4.3) and (4.4) simultaneously for $n = 0$. It can be verified
that these two equations are satisfied if $P^*(0)$ has the following set of elements
(not unique unless $J(0) = A$):


( 
(4.10) P*(0) =<" 


~1)'* Highent nite 
where $u$ is the smallest integer greater than $h_1 - s$.

An alternative method of computing $\hat{p}$ is to find vectors $K(n_r)$ and
$K^*(n_r)$ by substituting 1 for every $p$ and $q$ in the vectors $P(0)$ and $P^*(0)$
and in each matrix $J(n)$, and then performing the operations analogous to those
indicated in (3.10), (4.8), and (4.9). The $r$th elements in these vectors are equal
to $K(n_r, d_r)$ and $K^*(n_r, d_r)$, respectively; whence

(4.11)    $\hat{p} = K_r^*(n_r)/K_r(n_r)$.

5. The probability of acceptance. For $iv$ an integer, let

(5.1)    $g(i) = \sum_{d<i} (-1)^d\, C_d^{(i-d)v+d-1}\, p^d q^{-(i-d)v-d}$

for $d = 0, 1, \cdots$, where $C_d^{(i-d)v+d-1}$ is defined by (3.12). Also, for a
sampling plan of the type shown in Table 1, let the vector $V$ be defined by

(5.2)    $V = \sum_{c=0}^{\infty} P(v_0 + cv), \quad c = 0, 1, 2, \cdots;$

that is, let $V$ be the infinite sum of those vectors in equation (3.10) for which
$t = 0$. Then from Bartky’s results [1], the elements of $V$ are

(5.3)    $V_i = \begin{cases} g(i)g(h_2)/g(h_1 + h_2) & \text{if } i \le h_1, \\ g(i)g(h_2)/g(h_1 + h_2) - g(i - h_1) & \text{if } i > h_1, \end{cases}$

for $i = 1, 2, \cdots, k$, where $k = h_1 + h_2$; and the probability of accepting the
lot is

(5.4)    $L_p = g(h_2)/g(h_1 + h_2) = \begin{cases} q^v V_1 & \text{if } v_0 = 0, \\ q^{v_0} P_1(0) + q^v V_1 & \text{if } v_0 > 0. \end{cases}$

In evaluating $g(i)$ for small $i$, it is convenient to use

(5.5)    $g(i) = q^{-vi} A_{vi-1}$

in conjunction with Girshick’s [5] difference equation

(5.6)    $A_n = A_{n-1} - pq^{v-1} A_{n-v}$,

with the initial conditions

(5.7)    $A_n = 1$ if $n = 0, 1, \cdots, v - 1$.

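
As a numerical illustration of (5.4) to (5.7), the following Python sketch
evaluates L_p = g(h_2)/g(h_1 + h_2) with g(i) = q^(-vi) A_(vi-1), where
A_n = A_(n-1) - p q^(v-1) A_(n-v) and A_n = 1 for n = 0, ..., v - 1. This reading
of the printed formulas is an assumption where the source is hard to read, and
the function name and the integer arguments m1 = h_1/s and m2 = h_2/s are
introduced here only for the example. With s = .04 it agrees with the actual
risks shown in Table 5 to the number of places printed there.

```python
def acceptance_probability(v, m1, m2, p):
    """Exact L_p for a plan with v = 1/s, m1 = h1/s, m2 = h2/s (positive integers)."""
    q = 1.0 - p
    top = m1 + m2 - 1
    # Initial conditions: A_n = 1 for n = 0, 1, ..., v - 1.
    A = [1.0] * max(v, top + 1)
    # Difference equation: A_n = A_{n-1} - p q^{v-1} A_{n-v}.
    for n in range(v, top + 1):
        A[n] = A[n - 1] - p * q ** (v - 1) * A[n - v]
    # g(i) = q^{-vi} A_{vi-1}, so g(h2)/g(h1+h2) = q^{m1} A_{m2-1} / A_{m1+m2-1}.
    return q ** m1 * A[m2 - 1] / A[top]


# Example 1 of Table 5: s = .04, h1 = h2 = 1, evaluated at p1 and p2.
alpha = 1.0 - acceptance_probability(25, 25, 25, 0.010720)
beta = acceptance_probability(25, 25, 25, 0.097766)
print(round(alpha, 3), round(beta, 3))
```

Working with the integers v, h_1/s, and h_2/s avoids rounding in the subscripts
of A.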


6. The average number of items selected. For the sampling plan indicated
in Table 1, suppose the items are selected in groups of size $v$ after the initial
group of size $v_0$, the $r$th group being selected after the $(r - 1)$th group has
been completely inspected. The total number of items selected, when a decision has
been reached to accept or reject the lot, is therefore either $v_0$, or the sum of
$v_0$ and some multiple of $v$. Let this number be denoted by the product $vr_p$, so
that $r_p$ is the sum of a nonnegative integer and the fraction $v_0/v$, where
$0 \le v_0/v < 1$. From Bartky’s [1] equation (14), the average or expected value of
$r_p$ is given by

(6.1)    $\bar{r}_p = v_0/v + \sum_{i=1}^{k} V_i, \quad k = h_1 + h_2,$

where $V_i$ is defined by (5.3). Also, if we let

(6.2)    $G(i) = \sum_{d<i} g(i - d), \quad d = 0, 1, \cdots,$

where $g(i - d)$ is defined by (5.1), then

(6.3)    $\sum_{i=1}^{k} V_i = L_p G(h_1 + h_2 - 1) - G(h_2 - 1)$,

where $L_p$ is the probability of acceptance, as in equation (5.4). Since $1/s$ and
$h_1/s$ are integers for the type of plan considered here,

(6.4)    $v_0/v = h_1 - [h_1]$,

where $[h_1]$ is the largest integer $\le h_1$. It follows that

(6.5)    $\bar{r}_p = L_p G(h_1 + h_2 - 1) - G(h_2 - 1) + h_1 - [h_1]$,

and that the average number of items selected is $v\bar{r}_p$.

7. The average amount of inspection. If inspection ceases immediately when 
inequality (2.1) is no longer true, the average or expected number of items 
inspected for the sampling plan indicated in Table 1 may be derived as outlined 
in the following paragraphs. 

Let $S_i$ denote the $i$th element in the infinite sum of vectors

(7.1)    $S = \sum_{n=0}^{\infty} P(n)$.

Employing (3.10) and (5.2), we write

(7.2)    $S = \sum_{n=0}^{v_0-1} A^n P(0) + \sum_{t=0}^{v-1} A^t V = (I - A)^{-1}[(I - A^{v_0})P(0) + (I - A^v)V]$.

The elements of $(I - A^{v_0})P(0)$ are 0 if $v_0 = 0$; otherwise they are
$(1 - q^{v_0})P_i(0)$ for $i = 1$, and $P_i(0) - P_{i-1}(v_0)$ for $i = 2, 3, \cdots, k$.
Similarly, the elements of $(I - A^v)V$ are $(1 - q^v)V_i$ for $i = 1$, and
$V_i - V_{i-1} + P_{i-1}(v_0)$ for $i = 2, 3, \cdots, k$. The elements of
$(I - A)^{-1}$ are equivalent to $p^{-1}$ for $j \le i$, and 0 for $j > i$. The
elements of $S$ are therefore

(7.3)    $S_i = \begin{cases} p^{-1}\{(1 - q^v)V_1 + \sum_{j=2}^{i} [V_j - V_{j-1} + P_{j-1}(0)]\} & \text{if } v_0 = 0, \\ p^{-1}\{(1 - q^{v_0})P_1(0) + (1 - q^v)V_1 + \sum_{j=2}^{i} [V_j - V_{j-1} + P_j(0)]\} & \text{if } v_0 > 0, \end{cases}$

where the summation is taken to be 0 for $i = 1$. This equation may be reduced to

(7.4)    $S_i = \begin{cases} p^{-1}(V_i - L_p) & \text{if } i \le h_1, \\ p^{-1}(V_i - L_p + 1) & \text{if } i > h_1. \end{cases}$

To find the average amount of inspection, we can now employ

(7.5)    $\bar{n}_p = \sum_{i=1}^{k} S_i$,

the derivation of which is similar to that used by Bartky [1] to obtain his
equation (14). It follows that

(7.6)    $\bar{n}_p = p^{-1}\{\bar{r}_p + h_2 - (h_1 + h_2)L_p\} = p^{-1}\{L_p[G(h_1 + h_2 - 1) - (h_1 + h_2)] - G(h_2 - 1) + h_1 + h_2 - [h_1]\}$,

where $\bar{r}_p$ and $G(i)$ are defined by equations (6.5) and (6.2).
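
The relation between the average amount of inspection, the average number of
groups, and the probability of acceptance can be checked by brute force. The
Python sketch below is an added illustration with hypothetical names. It assumes
a plan of the Table 1 type with v = 1/s an integer and with h_1 and h_2 positive
integers, so that the initial group is empty. It follows the continuation
probabilities item by item, accumulates L_p, the mean number of items inspected,
and the mean number of groups begun, and compares the results with the identity
that the mean number of items times p equals the mean number of groups plus
h_2 - (h_1 + h_2)L_p.

```python
def exact_characteristics(v, h1, h2, p, tol=1e-12, n_cap=100000):
    """Return (L_p, mean items inspected, mean groups selected) by enumeration."""
    q = 1.0 - p
    cont = {0: 1.0}          # d -> probability of continuing with d defectives
    n = 0
    L = n_bar = r_bar = 0.0
    while cont and sum(cont.values()) > tol and n < n_cap:
        n_bar += sum(cont.values())   # every continuing path inspects item n + 1
        n += 1
        reached = {}
        for d, pr in cont.items():
            for d2, w in ((d, pr * q), (d + 1, pr * p)):
                if w != 0.0:
                    reached[d2] = reached.get(d2, 0.0) + w
        cont = {}
        groups = -(-n // v)           # number of groups begun when item n is drawn
        for d, pr in reached.items():
            if d * v <= n - h1 * v:       # accept: d <= ns - h1
                L += pr
                r_bar += groups * pr
            elif d * v >= n + h2 * v:     # reject: d >= ns + h2
                r_bar += groups * pr
            else:
                cont[d] = pr
    return L, n_bar, r_bar


p = 0.097766
L, n_bar, r_bar = exact_characteristics(25, 1, 1, p)
print(L, n_bar, p * n_bar - (r_bar + 1 - 2 * L))
```

The comparisons with ns are written in integer form (multiplied through by v)
so that ties at the acceptance and rejection boundaries are decided exactly.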

8. Approximation formulas. Several formulas have been proposed for approximating
the probability of acceptance and the average amount of inspection for
sequential sampling plans. Such formulas are convenient not only for the type
of plan shown in Table 1 where $s$, $h_1$, and $h_2$ are rational numbers, but
particularly for plans where any or all of these parameters may not be rational
and approximation by step-by-step evaluation of terms like those in the last three
columns of Table 2 would be long and tedious. They are also useful in designing
plans that are to have certain specified properties, as outlined in Section 9.

The following formulas were proposed by Bartky [1]: 

(2vi + 2v/3 — 4/3)(v — 1) if vp 
(8.1) g(t) ~< 


1~1,,~t+1/9 


Hl = op) 4 [qa — (v — 1l)pz]) x if vp 
(vi? + 5vi/3 + v/18 — 4:/3 
— 1/18 — v"/9)(v — 1) 


Gi) ~ = , z ” 
il — vp) — 4v(v — 1)p(1 — vp)~ 


+ [q — (v — 1)pay"(1 — 2)" 
where $v = 1/s$ and $x$ is the real positive root of

(8.3)    $(px + q)^v = x$
that is unequal to 1. In practical problems, we may first compute $p$ for selected
real positive values of the dummy variable $x$ by employing

(8.4)    $p = (x^s - 1)(x - 1)^{-1}$,

which is equivalent to (8.3), and then find the corresponding values of $L_p$, the
probability of acceptance, and $\bar{n}_p$, the average amount of inspection, from
equations (5.4) and (7.6) in conjunction with (8.1) and (8.2). Comparisons of the
results with exact values for illustrative examples are shown in Tables 3 and 4
on the lines headed “Formula (8.1)” and “Formula (8.2)”. In this connection,
it should be observed that Bartky recommended these approximations only for
cases corresponding to those where $h_1 > 3$. The examples illustrated here were
deliberately chosen to show comparisons under more adverse conditions.

TABLE 3
Computation of $L_p$, the probability of acceptance, to three decimal places
for illustrative examples
[Only the entries legible in the source are reproduced; "..." marks an illegible entry.]

                          x = 10     x = 5      x = 2      ...

Example 1: s = .04, h_1 = ...
  Exact                    ...        ...        ...
  Formula (8.1)            ...        ...        ...
  Formula (8.5)            ...        ...        ...
  Formula (8.6)*           ...        ...        ...
  Formula (8.8)            ...        ...        ...

Example 2: s = .04, h_1 = 2, ...
  Exact                    ...        ...        ...
  Formula (8.1)            .950       ...        ...
  Formula (8.5)            .962       ...        ...
  Formula (8.6)*           .901       ...        ...
  Formula (8.8)            .951       ...        ...

Example 3: s = .04, h_1 = ...
  Exact                    .996       .981       .888
  Formula (8.1)            .995       .980       .887
  Formula (8.5)            .996       .981       .891
  Formula (8.6)*           .991       .968       .857
  Formula (8.8)            .996       .980       .888

* Values shown for Formula (8.6) are taken from [3].
If $v$ is large and $p$ small, so that the probability of finding exactly $d$
defective items in a group of size $v$ approximates the corresponding term in the
Poisson series, then $g(i)$ is approximately equal to its limiting value as
$p \to 0$ and $v \to \infty$, and we may write

(8.5)    $g(i) \approx \sum_{d<i} (d!)^{-1} [(d - i)vp]^d e^{(i-d)vp}, \quad d = 0, 1, \cdots,$

where $0! = \Gamma(1) = 1$. The limiting value of $G(i)$ can be found by combining
(8.5) with (6.2). These limiting values of $g(i)$ and $G(i)$, for $i = 1, 2, \cdots$
and for selected values of $x$ as defined above, are given in Bartky’s Table II. The
resulting approximations to $L_p$ and $\bar{n}_p$ for illustrative examples are shown
in our Tables 3 and 4 on the line headed “Formula (8.5)”.

TABLE 4
Computation of $\bar{n}_p$, the average amount of inspection, to one decimal place
for illustrative examples
[Only the entries legible in the source are reproduced; "..." marks an illegible
entry or illegible digits.]

                     x = 10   x = 5    x = 2    x = 1    x = .5   x = .2   x = .1

Example 1: s = .04, h_1 = ...
  (all entries illegible)

Example 2: s = .04, h_1 = ...
  (all entries illegible)

Example 3: s = .04, h_1 = ...
  Exact               33....   ...      53....   60....   58....   44.7     35.4
  Formula (8.2)       ...      ...      53....   60.5     57.8     44.5     35.0
  Formula (8.5)       ...      ...      51....   58.9     56.7     43.9     34.7
  Formula (8.7)       ...      ...      ...      52....   ....6    37.4     29.5
  Formula (8.9)       33.6     40....   52.8     60.2     57....   43.5     33.9
Wald’s [2] formulas for approximating $L_p$ and $\bar{n}_p$ have been transformed by
the Statistical Research Group [3] into the following:

(8.6)    $L_p \approx \begin{cases} h_2/(h_1 + h_2) & \text{if } p = s, \\ (x^{h_1+h_2} - x^{h_1})/(x^{h_1+h_2} - 1) & \text{if } p \ne s, \end{cases}$

(8.7)    $\bar{n}_p \approx \begin{cases} h_1 h_2/[s(1 - s)] & \text{if } p = s, \\ [L_p(h_1 + h_2) - h_2]/(s - p) & \text{if } p \ne s, \end{cases}$

where $x$ has the same meaning as in (8.3) and (8.4) above. The results obtained
by applying these formulas to the examples shown in Tables 3 and 4 are indicated
on the lines headed “Formula (8.6)” and “Formula (8.7)”. These results
are evidently less satisfactory than those obtained from Bartky’s formulas,
particularly for small values of $h_2$. For other comparisons, see Mrs. Robinson’s
note [8].

For reasons outlined in Section 9 following, Wald’s formulas can be improved
by adding appropriate adjustments to the parameter $h_2$, so that the
approximations are computed from the following relationships:

(8.8)    $L_p \approx \begin{cases} (h_2 + a)/(h_1 + h_2 + a) & \text{if } p = s, \\ (x^{h_1+h_2+a} - x^{h_1})/(x^{h_1+h_2+a} - 1) & \text{if } p \ne s, \end{cases}$

(8.9)    $\bar{n}_p \approx \begin{cases} h_1(h_2 + b)/[s(1 - s)] & \text{if } p = s, \\ [L_p(h_1 + h_2 + cq) - (h_2 + cq)]/(s - p) & \text{if } p \ne s, \end{cases}$

where

(8.10)    $a = \tfrac{1}{3}(1 - 2s)$,

(8.11)    $b = a[1 + s(h_1 + h_2 + a)^{-1}]$,

(8.12)    $c = a/(1 - s)$.

For the illustrative examples shown in Tables 3 and 4, the application of these
formulas leads to the approximate values shown on the lines headed “Formula
(8.8)” and “Formula (8.9)”. The derivation of these semiempirical formulas is
discussed in Section 9.
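
The approximations of this section are simple to evaluate once x is chosen. The
following Python sketch is an added illustration with hypothetical function
names. It assumes the readings p = (x^s - 1)/(x - 1) for (8.4), Wald's
L_p = (x^(h1+h2) - x^(h1))/(x^(h1+h2) - 1) with the limit h_2/(h_1 + h_2) at
p = s, and the adjustments a = (1 - 2s)/3, b = a[1 + s/(h_1 + h_2 + a)], and
c = a/(1 - s). With s = .04, h_1 = 1, h_2 = 2 it returns the values printed for
Example 3 in Tables 3 and 4.

```python
def p_from_x(x, s):
    """Fraction defective corresponding to x, equation (8.4); x = 1 gives p = s."""
    return s if x == 1 else (x ** s - 1) / (x - 1)


def wald_L(x, h1, h2, a=0.0):
    """Formula (8.6) when a = 0; formula (8.8) when a is the adjustment (8.10)."""
    k = h1 + h2 + a
    if x == 1:
        return (h2 + a) / k
    return (x ** k - x ** h1) / (x ** k - 1)


def wald_n(x, s, h1, h2, adjusted=False):
    """Formula (8.7), or formula (8.9) when adjusted is True."""
    a = (1 - 2 * s) / 3 if adjusted else 0.0   # (8.10)
    b = a * (1 + s / (h1 + h2 + a))            # (8.11)
    c = a / (1 - s)                            # (8.12)
    if x == 1:
        return h1 * (h2 + b) / (s * (1 - s))
    p = p_from_x(x, s)
    cq = c * (1 - p)
    L = wald_L(x, h1, h2, a)
    return (L * (h1 + h2 + cq) - (h2 + cq)) / (s - p)


s, h1, h2 = 0.04, 1, 2
a = (1 - 2 * s) / 3
for x in (10, 5, 2):
    print(x, round(wald_L(x, h1, h2), 3), round(wald_L(x, h1, h2, a), 3))
```

Setting the adjustment to zero recovers Wald's formulas, so one pair of
functions serves both the unadjusted and the adjusted approximations.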

Where sample items are selected as outlined in Section 6, the average number
of items selected can be approximated by combining the foregoing approximations
with the equation

(8.13)    $\bar{r}_p = (h_1 + h_2)L_p - h_2 + p\bar{n}_p$,

obtained from (7.6), and then dividing by $s$.

9. Some comments on Wald’s approximations. Wald’s [2] formulas for sequential
sampling were developed to provide an objective criterion for deciding which
of two alternative hypotheses, $H_1$ and $H_2$, concerning the population sampled
is the correct one. In general, a sample statistic, $X_n$, is observed or computed
for $n = 1, 2, \cdots$, successively, where $n$ is the cumulative number of
observations. Let $f_1(X_n)$ and $f_2(X_n)$ denote the relative probabilities that
$X_n$ will be found in $n$ sample observations when $H_1$ and $H_2$, respectively, are
true. Also, let $\alpha$ denote the risk we are willing to run of making the wrong
decision when $H_1$ is true, and $\beta$ the corresponding risk when $H_2$ is true.
Then for a Wald sequential plan, sampling continues as long as the likelihood
ratio satisfies the inequality

(9.1)    $\dfrac{\beta}{1 - \alpha} < \dfrac{f_2(X_n)}{f_1(X_n)} < \dfrac{1 - \beta}{\alpha}$.

When inequality (9.1) no longer holds, however, sampling ceases and we conclude
that $H_1$ is true if the second member of (9.1) is equal to or less than the
first member, or that $H_2$ is true if the second member is equal to or greater
than the third member. Wald’s choice of this type of plan was based on his
conjecture, later proved [6], that it would minimize the average number of
observations for the given risks when either $H_1$ or $H_2$ is true.

If the population consists of an infinite lot of items that can each be classified
as defective or nondefective, if the alternative hypotheses state that the lot
fraction defective $p = p_1$ and $p_2$, respectively, and if $X_n = d$ (the cumulative
number of defective items in the first $n$ sample items observed), then inequality
(9.1) becomes

(9.2)    $\dfrac{\beta}{1 - \alpha} < \dfrac{p_2^d q_2^{n-d}}{p_1^d q_1^{n-d}} < \dfrac{1 - \beta}{\alpha}$.

This is equivalent to inequality (2.1), where

(9.3)    $s = \dfrac{\log (q_1/q_2)}{\log [(p_2 q_1)/(p_1 q_2)]}$,

(9.4)    $h_1 = \dfrac{\log [(1 - \alpha)/\beta]}{\log [(p_2 q_1)/(p_1 q_2)]}$,

and

(9.5)    $h_2 = \dfrac{\log [(1 - \beta)/\alpha]}{\log [(p_2 q_1)/(p_1 q_2)]}$,

“log” denoting logarithm to any convenient base. Formulas (9.3), (9.4), and
(9.5), or their equivalents, were therefore proposed by Wald and by the
Statistical Research Group [3] for use in designing a sampling plan with parameters
$s$, $h_1$ and $h_2$ such that the operating characteristic curve representing the
functional relationship between $p$, the fraction defective, and $L_p$, the
probability of deciding that $H_1$ is true, will pass through the specified points
$(p_1, 1 - \alpha)$ and $(p_2, \beta)$. In practice, for $p_1 < p_2$, a decision that
$H_1$ is true means that the lot being sampled is accepted as satisfactory, while a
decision that $H_2$ is true means that the lot is rejected.
In a situation where the specified points, $(p_1, 1 - \alpha)$ and $(p_2, \beta)$,
and the values of $s$, $h_1$, and $h_2$ computed from formulas (9.3), (9.4), and (9.5)
are such that the first two members of (2.1) are exactly equal whenever a decision
is reached to accept the lot, and the second and third members are exactly equal
whenever a decision is reached to reject the lot, the operating characteristic
curve will actually pass through the specified points. Moreover, if we set

(9.6)    $(p_2 q_1)/(p_1 q_2) = 1/x$,

the elimination of $p_1$ and $q_1$ from (9.3) and (9.6), together, will yield an
equation in $p_2$ equivalent to (8.4); while the elimination of $p_1$, $q_1$, and
$\alpha$ from (9.4), (9.5), and (9.6) will yield a value of $\beta$ equal to the
right-hand side of (8.6). In other words, for the situation just described,
equations (9.3) to (9.6) can be used to derive the exact formula for the operating
characteristic curve in terms of $x$. As Pólya [7] pointed out, the necessary and
sufficient conditions for this situation are that $s = \tfrac{1}{2}$ and that both
$2h_1$ and $2h_2$ be positive integers.

In other situations, when a decision to accept the lot is reached, the first and
second members of (2.1) will be equal if $1/s$ and $h_1/s$ are integers, and in no
case will the first member exceed the second member by as much as $s$. But when
decisions to reject are reached, the second and third members can not always be
equal if $s < \tfrac{1}{2}$, and the difference may be almost as large as $1 - s$.
It follows that when $s$ is small, most of the difficulty with Wald’s procedure for
choosing the parameters of the sampling plan lies in the formula for computing
$h_2$. This difficulty can be largely surmounted by subtracting an adjustment, $a$,
from the right-hand side of (9.5). This implies a similar adjustment of (8.6) by
adding the correction $a$ to each value of $h_2$, as shown in (8.8). The value of $a$
here proposed is that shown in (8.10), which was chosen so that, for $p = s$, (8.8)
would agree with the approximations to $L_p$ resulting from the combination of (5.4)
and (8.1). The suggested formula for $h_2$, therefore, is

(9.7)    $h_2 = \dfrac{\log [(1 - \beta)/\alpha]}{\log [(p_2 q_1)/(p_1 q_2)]} - \tfrac{1}{3}(1 - 2s)$.

Comparison between actual and specified risks for illustrative sampling plans
resulting from the use of formulas (9.5) and (9.7), in conjunction with (9.3) and
(9.4), is shown in Table 5.
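
The design computation can be sketched in a few lines of Python. This is an
added illustration, not code from the paper: the function name and the
adjust_h2 flag are hypothetical, and the adjustment is taken to be (1 - 2s)/3
subtracted from Wald's value of h_2. With p_1 = .010720 and p_2 = .097766 the
specified risks listed in Table 5 lead to s = .04 and to values of h_1 and h_2
equal to 1 or 2.

```python
from math import log


def wald_plan(p1, p2, alpha, beta, adjust_h2=False):
    """Return (s, h1, h2) from (9.3), (9.4), and (9.5), or (9.7) if adjust_h2."""
    q1, q2 = 1 - p1, 1 - p2
    denom = log((p2 * q1) / (p1 * q2))
    s = log(q1 / q2) / denom                  # (9.3)
    h1 = log((1 - alpha) / beta) / denom      # (9.4)
    h2 = log((1 - beta) / alpha) / denom      # (9.5)
    if adjust_h2:
        h2 -= (1 - 2 * s) / 3                 # adjustment of (9.7)
    return s, h1, h2


p1, p2 = 0.010720, 0.097766
print(wald_plan(p1, p2, 0.090909, 0.090909))
print(wald_plan(p1, p2, 0.044638, 0.095577, adjust_h2=True))
```

Both calls describe the same sampling plan; they differ only in the risks that
the designer would have to specify in order to arrive at it.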

To investigate formula (8.7), consider a situation where the points
$(p_1, 1 - \alpha)$ and $(p_2, \beta)$ and the resulting values of $s$, $h_1$, and
$h_2$ lead to a sampling plan such that a decision to accept or to reject the lot
can be reached only when the last item in some group has been inspected. In that
case, the relationship between $n_p$, the number of items inspected when a decision
is reached, and $r_p$, the equivalent number of groups of size $1/s$ selected, is
necessarily $r_p = sn_p$. Taking expected values and combining with (7.6) to
eliminate $\bar{r}_p$ leads to (8.7), which is therefore exact for the situation just
described. This situation, as Pólya noted, is precisely the same as the one where
(8.6) is exact; that is, the necessary and sufficient conditions for (8.7) to be
exact are that $s = \tfrac{1}{2}$ and that both $2h_1$ and $2h_2$ be integers.

In other situations where $1/s$ and $h_1/s$ are integers, the relationship
$r_p = sn_p$ will hold if the sampling inspection leads to a decision to accept the
lot. But for $s < \tfrac{1}{2}$, a decision to reject the lot may be reached before
the $r_p$th group is completely inspected. In other words, for all plans of the type
shown in Table 1, the relationship between $r_p$ and $n_p$ for every decision is

(9.8)    $r_p - 1 < sn_p \le r_p$.

This means that

(9.9)    $\bar{n}_p = [L_p(h_1 + h_2) - h_2 - f_p]/(s - p)$,

where $f_p$ is some function of $p$ that satisfies the conditions

(9.10)    $0 \le f_p < 1$

for all values of $p$, and $f_p \to 0$ as $p \to 0$ or 1. The formula

(9.11)    $f_p \approx cq(1 - L_p)$,

where $c$ is defined by (8.12), results in approximations to $\bar{n}_p$ that satisfy
these conditions, and that are about the same as the approximations resulting from
the combination of (7.6) and (8.2) for values of $p$ in the neighborhood of $s$.
Rewriting (9.9) and (9.11) as indicated in (8.9) shows that most of the difficulty
in formula (8.7) is associated with the parameter $h_2$.

TABLE 5
Comparison between actual and specified risks, for different methods of computing
$h_2$ leading to the same sampling plan, where $p_1 = .010720$ and $p_2 = .097766$

                                 Example 1    Example 2    Example 3

Specified risks:
  Formula (9.5)
    α                             .090909      .099099      .009009
    β                             .090909      .009009      .099099
  Formula (9.7)
    α                             .044638      .048886      .004444
    β                             .095577      .009511      .099556
Parameters of resulting plan:
    s                             .04          .04          .04
    h_1                           1.00         2.00         1.00
    h_2                           1.00         1.00         2.00
Actual risks:
    α                             .037         .041         .0044
    β                             .096         .0096        .0996

Formulas (8.8) and (8.9) are equivalent to Wald’s formulas when $s = \tfrac{1}{2}$,
and are therefore exact for this value of $s$ when $2h_1$ and $2h_2$ are integers.
Again like Wald’s formulas, the approximations to $L_p$ and $\bar{n}_p$ approach the
exact values as $p \to 0$ or 1. Investigation of these formulas and their first
derivatives for $p = s$ indicates increasingly close agreement with the
approximations computed from (8.1) and (8.2) as $h_2$ increases in size.

AN APPLICATION OF INFORMATION THEORY TO MULTIVARIATE
ANALYSIS


By S. Kullback
The George Washington University


0. Summary. The problem considered is that of finding the “best” linear
function for discriminating between two multivariate normal populations, $\pi_1$
and $\pi_2$, without limitation to the case of equal covariance matrices. The “best”
linear function is found by maximizing the divergence, $J'(1, 2)$, between the
distributions of the linear function. Comparison with the divergence, $J(1, 2)$,
between $\pi_1$ and $\pi_2$ offers a measure of the discriminating efficiency of the
linear function, since $J(1, 2) \ge J'(1, 2)$. The divergence, a special case of
which is Mahalanobis’s Generalized Distance, is defined in terms of a measure of
information which is essentially that of Shannon and Wiener. Appropriate
assumptions about $\pi_1$ and $\pi_2$ lead to discriminant analysis (Sections 4, 7),
principal components (Section 5), and canonical correlations (Section 6).


1. Introduction. The following extract from Section (4), “Scientific Method,”
of Cherry [15] is pertinent. “...the idea of information has existed in early times
and has gradually entered into a great variety of sciences, to a certain extent
integrating them together. Nowadays the concept of information would seem
to be essential to all research workers, and as universal and fundamental as
the concepts of energy or entropy. Speaking most generally, every time we make
any observation, or perform any ‘experiment’, we are seeking for information;
the question thus arises: How much can we know from a particular set of
observations or experiments? The modern mathematical work, at which we have
glanced, seeks to answer in precise terms this very question which, in its origin,
is an epistemological one. But first a word of caution: the term ‘information’
has been used by different authors as having different meanings.... The
information supplied by an experiment may perhaps be thought of as a ratio of the
a posteriori to the a priori probabilities (strictly, the logarithm of this ratio).”

R. A. Fisher’s measure of information (intrinsic accuracy) was introduced 
to compare the merits of different estimates. Shannon and Wiener’s measure of 
information was introduced to define and measure that which is being conveyed 
by a communication system, the latter considered as a stochastic process. (See 
references in [9], [15].) 

The author and Leibler, in [9], generalized Shannon and Wiener’s definition 
to the abstract case and showed that it and Fisher’s definition are not unrelated. 
Properties of a measure of divergence between statistical populations, defined 
in terms of the measure of information, were also derived in [9]. 

Other approaches to a definition of the distance or divergence between two 
populations, and the applications of such a concept, have been made by Maha- 
lanobis [10], Bhattacharyya [16], [17], and Rao [18]. 
The measure of divergence and its properties, as derived in [9], may be applied
in particular to the problem of discrimination between certain multivariate
normal populations by means of linear functions. We do not limit ourselves
to the case of equal covariance matrices. Although no essentially new
results are derived in the following, it is believed that the methods and
underlying uniformity of approach may be of pedagogical interest.¹

Matrix notation, methods and results are used and assumed known to the 
reader. For the purpose of this paper we limit ourselves to a discussion using 
population parameters and do not consider problems of estimation or distribu- 
tion. Attention is invited to Bartlett [2], Brown [3], Cochran and Bliss [4], 
Kendall [8], Penrose [11], Smith [12], Tintner [13] and Wilks [14] for discussions 
of related problems and additional references to the literature. 


2. Divergence. 

(a) Definition. If two multivariate normal populations 7, and m2 have the 
respective probability densities f;(x: , x2 , --- , %),7 = 1, 2, then the divergence 
between 2; and z2 is defined by [9] 


(21) J(1,2) = [ Gan, oo, wa) — Sela «> +d log f(t +25 Za) dx,---dt,. 


fla, oe > Xx) 


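The mean information for discrimination between $\pi_1$ and $\pi_2$ per observation
from $\pi_1$ is defined by [9]

(2.2)  $I(1:2) = \int f_1(x_1,\ldots,x_k) \log \frac{f_1(x_1,\ldots,x_k)}{f_2(x_1,\ldots,x_k)} \, dx_1 \cdots dx_k = \int f_1(x_1,\ldots,x_k) \log \frac{P(\pi_1 \mid x_1,\ldots,x_k)}{P(\pi_2 \mid x_1,\ldots,x_k)} \, dx_1 \cdots dx_k - \log \frac{P(\pi_1)}{P(\pi_2)},$

where $P(\pi_i)$ and $P(\pi_i \mid x_1, \ldots, x_k)$ are respectively the a priori and a
posteriori probabilities for $\pi_i$, $i = 1, 2$, and a corresponding definition for $I(2:1)$.

It is seen that $J(1,2) = I(1:2) + I(2:1)$.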
If

(2.3)  $y_\alpha = y_\alpha(x_1, x_2, \ldots, x_k), \quad \alpha = 1, 2, \ldots, r,$

are functions of the random variables $x_1, x_2, \ldots, x_k$, such that the
distribution of the y's is given by the probability density function

(2.4)  $g_i(y_1, y_2, \ldots, y_r), \quad i = 1, 2,$

according as the x's come from $\pi_1$ or $\pi_2$, then the divergence between the
populations of y's is defined by [9]

(2.5)  $J'(1,2) = \int \bigl(g_1(y_1,\ldots,y_r) - g_2(y_1,\ldots,y_r)\bigr) \log \frac{g_1(y_1,\ldots,y_r)}{g_2(y_1,\ldots,y_r)} \, dy_1 \cdots dy_r,$

and the mean information for discrimination between $\pi_1$ and $\pi_2$ per observation
from $g_1(y_1, \ldots, y_r)$ is defined by [9]

(2.6)  $I'(1:2) = \int g_1(y_1,\ldots,y_r) \log \frac{g_1(y_1,\ldots,y_r)}{g_2(y_1,\ldots,y_r)} \, dy_1 \cdots dy_r,$

and a corresponding definition for $I'(2:1)$. It is seen that $J'(1,2) = I'(1:2) + I'(2:1)$.

¹ This approach has been found helpful in presenting certain aspects of multivariate
analysis to a class at the George Washington University.

(b) Properties. The following properties of I, I', J and J' will be utilized.
(For proofs see [9].)

i.   $I(1:2) \ge 0$; $J(1,2) \ge 0$, with equality if and only if $f_1 = f_2$ a.e.;
ii.  I, J are additive for independent random variables;
iii. $I(1:2) \ge I'(1:2)$; $J(1,2) \ge J'(1,2)$, with equality if and only if

     $\frac{f_1(x_1,\ldots,x_k)}{f_2(x_1,\ldots,x_k)} = \frac{g_1(y_1,\ldots,y_r)}{g_2(y_1,\ldots,y_r)}$ a.e.,

     in which case we say the functions $y_1, y_2, \ldots, y_r$ are sufficient;
iv.  the ratio $J'(1,2)/J(1,2)$ is the discrimination efficiency of the y's in the
     sense that N observations of the y's will in the mean discriminate as well as
     n observations of the x's where $NJ' = nJ$.

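(c) Particular cases. If we denote the one-column matrix of the means of
population $\pi_i$ by $\mu_{(i)}$, $i = 1, 2$, and the matrix of variances and covariances of
population $\pi_i$ by $\sigma_{(i)}$, $i = 1, 2$, then evaluating (2.1) and (2.2) leads respectively
to

(2.7)  $J(1,2) = \tfrac12 \operatorname{tr}\bigl[(\sigma_{(1)} - \sigma_{(2)})(\sigma_{(2)}^{-1} - \sigma_{(1)}^{-1})\bigr] + \tfrac12 (\mu_{(1)} - \mu_{(2)})'(\sigma_{(1)}^{-1} + \sigma_{(2)}^{-1})(\mu_{(1)} - \mu_{(2)}),$

(2.8)  $I(1:2) = \tfrac12 \log \frac{|\sigma_{(2)}|}{|\sigma_{(1)}|} - \frac{k}{2} + \tfrac12 \operatorname{tr} \sigma_{(1)}\sigma_{(2)}^{-1} + \tfrac12 (\mu_{(1)} - \mu_{(2)})' \sigma_{(2)}^{-1} (\mu_{(1)} - \mu_{(2)}),$

where tr A is the trace (or spur) of the matrix A and the prime on a matrix
denotes the transpose.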
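If $\sigma_{(1)} = \sigma_{(2)} = \sigma$, then (2.7) becomes

(2.9)  $J(1,2) = (\mu_{(1)} - \mu_{(2)})' \sigma^{-1} (\mu_{(1)} - \mu_{(2)}) = \delta' \sigma^{-1} \delta,$

where $\delta = \mu_{(1)} - \mu_{(2)}$, and the last member in (2.9) is k times Mahalanobis's
Generalized Distance ([4] p. 162, [10]), and (2.8) becomes

(2.10)  $I(1:2) = \tfrac12 \delta' \sigma^{-1} \delta.$

If $\mu_{(1)} = \mu_{(2)}$, then (2.7) becomes

(2.11)  $J(1,2) = \tfrac12 \operatorname{tr}\bigl[(\sigma_{(1)} - \sigma_{(2)})(\sigma_{(2)}^{-1} - \sigma_{(1)}^{-1})\bigr] = \tfrac12 \operatorname{tr} \sigma_{(1)}\sigma_{(2)}^{-1} + \tfrac12 \operatorname{tr} \sigma_{(2)}\sigma_{(1)}^{-1} - k,$

which for the single variate case (with $\sigma_i^2$ the variance under $\pi_i$) is

(2.12)  $J(1,2) = \tfrac12 \frac{\sigma_1^2}{\sigma_2^2} + \tfrac12 \frac{\sigma_2^2}{\sigma_1^2} - 1,$

and (2.8) becomes

(2.13)  $I(1:2) = \tfrac12 \log \frac{|\sigma_{(2)}|}{|\sigma_{(1)}|} - \frac{k}{2} + \tfrac12 \operatorname{tr} \sigma_{(1)}\sigma_{(2)}^{-1},$

which for the single variate case is

(2.14)  $I(1:2) = \tfrac12 \log \frac{\sigma_2^2}{\sigma_1^2} - \tfrac12 + \tfrac12 \frac{\sigma_1^2}{\sigma_2^2}.$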
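
The single variate forms of (2.7) and (2.8) can be checked numerically. The following is a minimal sketch, not part of the original paper: the function names and the parameter values are illustrative assumptions, and the integral (2.1) is approximated by a simple midpoint rule.

```python
import math

def info(m1, v1, m2, v2):
    """I(1:2) for N(m1, v1) against N(m2, v2): the k = 1 case of (2.8)."""
    return (0.5 * math.log(v2 / v1) - 0.5 + 0.5 * v1 / v2
            + 0.5 * (m1 - m2) ** 2 / v2)

def divergence(m1, v1, m2, v2):
    """J(1,2) for the same pair: the k = 1 case of (2.7)."""
    return (0.5 * (v1 - v2) * (1.0 / v2 - 1.0 / v1)
            + 0.5 * (m1 - m2) ** 2 * (1.0 / v1 + 1.0 / v2))

def divergence_numeric(m1, v1, m2, v2, lo=-40.0, hi=40.0, steps=80000):
    """Midpoint-rule approximation of the integral (2.1) for k = 1."""
    def log_pdf(x, m, v):
        return -0.5 * math.log(2.0 * math.pi * v) - (x - m) ** 2 / (2.0 * v)
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * h
        l1, l2 = log_pdf(x, m1, v1), log_pdf(x, m2, v2)
        total += (math.exp(l1) - math.exp(l2)) * (l1 - l2)
    return total * h

# Hypothetical means and variances, chosen only for illustration.
m1, v1, m2, v2 = 0.0, 1.0, 1.0, 4.0
j_closed = divergence(m1, v1, m2, v2)
j_sum = info(m1, v1, m2, v2) + info(m2, v2, m1, v1)   # I(1:2) + I(2:1)
j_num = divergence_numeric(m1, v1, m2, v2)
print(j_closed, j_sum, j_num)
```

Under these assumed values the three quantities should agree, which illustrates the relation J(1,2) = I(1:2) + I(2:1).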


3. Linear discriminant function. Let us consider the following problem:
Determine the values of the coefficients $a_1, \ldots, a_k$, such that for

(3.1)  $y = a_1 x_1 + a_2 x_2 + \cdots + a_k x_k,$

the value of $J'(1,2)$ is a maximum, when $x_1, x_2, \ldots, x_k$ come from $\pi_1$ and
$\pi_2$. Depending on the assumptions regarding $\pi_1$ and $\pi_2$ we are led to a number
of now classical results.


4. Equal covariance matrices. Assume that $\sigma_{(1)} = \sigma_{(2)} = \sigma$; since y in (3.1)
is normally distributed, we have as the single variate case of (2.9)

(4.1)  $J'(1,2) = \bigl(E(y_{(1)}) - E(y_{(2)})\bigr)^2 / \sigma_y^2 = (a'\delta)^2 / (a'\sigma a),$

where a is the one-column matrix of the $a_i$, $i = 1, 2, \ldots, k$, and $\delta$ is defined as
in (2.9). By selecting the a's such that

(4.2)  $\sigma a = \delta,$

we have

(4.3)  $J'(1,2) = \frac{(\delta'\sigma^{-1}\delta)^2}{\delta'\sigma^{-1}\sigma\sigma^{-1}\delta} = \delta'\sigma^{-1}\delta = J(1,2),$

so that with the a's as given by (4.2), the linear function y of (3.1) is sufficient,
and $J'(1,2)$ attains its maximum possible value (cf. [2], [3], [4], [5]).
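
The statement that the coefficients of (4.2) make the linear function sufficient can be illustrated on a small example. This is a minimal sketch under assumed values: the covariance matrix `sigma`, the mean difference `delta`, and the helper names are hypothetical and do not come from the paper.

```python
def solve2(m, b):
    """Solve the 2x2 linear system m a = b by Cramer's rule."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [(b[0] * m[1][1] - m[0][1] * b[1]) / det,
            (m[0][0] * b[1] - b[0] * m[1][0]) / det]

def j_prime(a, sigma, delta):
    """Divergence (4.1) of y = a'x: (a'delta)^2 / (a' sigma a)."""
    num = (a[0] * delta[0] + a[1] * delta[1]) ** 2
    den = sum(a[i] * sigma[i][j] * a[j] for i in range(2) for j in range(2))
    return num / den

# Hypothetical common covariance matrix and difference of means.
sigma = [[2.0, 0.6], [0.6, 1.0]]
delta = [1.0, -0.5]

a_opt = solve2(sigma, delta)                         # coefficients from (4.2)
j_full = delta[0] * a_opt[0] + delta[1] * a_opt[1]   # delta' sigma^{-1} delta, (2.9)
print(j_prime(a_opt, sigma, delta), j_full)
print(j_prime([1.0, 0.0], sigma, delta), j_prime([0.0, 1.0], sigma, delta))
```

With the coefficients from (4.2) the value of (4.1) equals J(1,2); the two single-variable choices printed last give smaller values, as expected.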
5. Principal components. Let us assume that $\mu_{(1)} = \mu_{(2)}$, in which case, as
we have seen, $J(1,2)$ is given by (2.11). For the linear function (3.1) we then
derive, in view of (2.12),

(5.1)  $J'(1,2) = \tfrac12 \frac{a'\sigma_{(1)}a}{a'\sigma_{(2)}a} + \tfrac12 \frac{a'\sigma_{(2)}a}{a'\sigma_{(1)}a} - 1.$

To find the values of the a's which will maximize (5.1), the usual calculus
procedures yield the result that the a's must satisfy

(5.2)  $\sigma_{(1)}a = \lambda\sigma_{(2)}a,$

where $\lambda$ is a root of the determinantal equation

(5.3)  $|\sigma_{(1)} - \lambda\sigma_{(2)}| = 0,$

all roots of which are real and positive. Let these roots be $\lambda_1, \lambda_2, \ldots, \lambda_k$,
arranged in ascending order. Corresponding to the root $\lambda_i$, and using (5.2), it
is found that (5.1) may be written as

(5.4)  $J'(1,2;\lambda_i) = \tfrac12 \lambda_i + \frac{1}{2\lambda_i} - 1,$
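and that

(5.5)  $\sum_{i=1}^{k} J'(1,2;\lambda_i) = \tfrac12 \sum_{i=1}^{k} \lambda_i + \tfrac12 \sum_{i=1}^{k} \frac{1}{\lambda_i} - k = J(1,2),$

since

(5.6)  $\sum_{i=1}^{k} \lambda_i = \operatorname{tr} \sigma_{(1)}\sigma_{(2)}^{-1}, \qquad \sum_{i=1}^{k} \frac{1}{\lambda_i} = \operatorname{tr} \sigma_{(2)}\sigma_{(1)}^{-1};$

also, using (2.13) and (2.14), we have

(5.7)  $I'(1:2;\lambda_i) = -\tfrac12 \log \lambda_i - \tfrac12 + \tfrac12 \lambda_i,$

(5.8)  $I(1:2) = -\tfrac12 \log(\lambda_1\lambda_2\cdots\lambda_k) - \frac{k}{2} + \tfrac12 \sum_{i=1}^{k} \lambda_i = \sum_{i=1}^{k} I'(1:2;\lambda_i).$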


To determine the value for which (5.4) is a maximum, proceed as follows.
Consider the function

(5.9)  $f(\lambda) = \tfrac12 \lambda + \frac{1}{2\lambda} - 1, \qquad \lambda > 0.$

By examining the derivatives of $f(\lambda)$, it is readily determined that $f(\lambda)$ is a
minimum for $\lambda = 1$, is monotonically increasing for $\lambda > 1$, is monotonically
decreasing for $0 < \lambda < 1$, and $f(\lambda) = f(1/\lambda)$. Thus, the maximum of (5.4) occurs
for $\lambda_1$ or $\lambda_k$ according as

(5.10)  $\lambda_1\lambda_k < 1 \quad \text{or} \quad \lambda_1\lambda_k > 1,$

and the best linear discriminant function (3.1) is the one for which the a's
respectively satisfy

(5.11)  $\sigma_{(1)}a = \lambda_1\sigma_{(2)}a \quad \text{or} \quad \sigma_{(1)}a = \lambda_k\sigma_{(2)}a.$
To illustrate, let us take

(5.12)  $\sigma_{(1)} = \frac{1}{7}\begin{pmatrix} 12{,}496.8 & -6{,}786.6 \\ -6{,}786.6 & 32{,}985.0 \end{pmatrix}, \qquad \sigma_{(2)} = \frac{1}{49}\begin{pmatrix} 136{,}972.6 & 58{,}549.0 \\ 58{,}549.0 & 71{,}496.1 \end{pmatrix},$

which are respectively the treatments and residual values of Table I, p. 177 of
Bartlett [2]. Using the values of (5.12), we find as the roots of (5.3)


(5.13)  $\lambda_1 = 0.44158, \qquad \lambda_2 = 6.38381,$

so that (5.4) yields

(5.14)  $J'(1,2;\lambda_1) = .35309; \qquad J'(1,2;\lambda_2) = 2.27023;$

and from (5.5) we have that

(5.15)  $J(1,2) = .35309 + 2.27023 = 2.62332.$

Since $\lambda_1\lambda_2 > 1$, the best linear discriminant function is that associated with $\lambda_2$
(as is also evident from (5.14)), and (5.11) becomes

(5.16)  $\frac{1}{7}\begin{pmatrix} 12{,}496.8 & -6{,}786.6 \\ -6{,}786.6 & 32{,}985.0 \end{pmatrix}\begin{pmatrix} a_1 \\ a_2 \end{pmatrix} = \frac{6.38381}{49}\begin{pmatrix} 136{,}972.6 & 58{,}549.0 \\ 58{,}549.0 & 71{,}496.1 \end{pmatrix}\begin{pmatrix} a_1 \\ a_2 \end{pmatrix},$

or

(5.17)  $112418.1a_1 + 60181.5a_2 = 0,$
        $60181.5a_1 + 32217.3a_2 = 0,$

leading to

(5.18)  $a_1 = -.535a_2,$

that is, the linear function (see p. 179 of [2])

(5.19)  $y = x_2 - 0.535x_1$

is 86.5% efficient, since

(5.20)  $J'(1,2;\lambda_2)/J(1,2) = \frac{2.27023}{2.62332} = .865.$

Using the values in (5.13), we have from (5.8)

(5.21)  $I(1:2) = 1.89449,$

and since $J(1,2) = I(1:2) + I(2:1)$ [9],

(5.22)  $I(2:1) = .72883.$

For the linear function in (5.19) associated with $\lambda_2$, we have from (5.7)

(5.23)  $I'(1:2;\lambda_2) = -\tfrac12 \log \lambda_2 - \tfrac12 + \tfrac12 \lambda_2 = 1.76502,$

and since $J'(1,2;\lambda_2) = I'(1:2;\lambda_2) + I'(2:1;\lambda_2)$,

(5.24)  $I'(2:1;\lambda_2) = .50521.$

These values are summarized in Table 1. 


TABLE 1

              $\lambda_1$ = .44158    $\lambda_2$ = 6.38381
I'(1:2)          .12947            1.76502          1.89449 = I(1:2)
I'(2:1)          .22362             .50521           .72883 = I(2:1)
J'(1,2)          .35309            2.27023          2.62332 = J(1,2)


From Table 1 it seems reasonable to infer that the linear function (5.19) is 
affected by the treatments. 
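
The figures of this illustration can be reproduced directly from (5.12). The sketch below is an assumption-laden check rather than the author's computation: it solves the 2 × 2 determinantal equation (5.3) as a quadratic, and the function names are illustrative. Small differences in the last decimal places relative to (5.13) are to be expected from rounding.

```python
import math

def roots_2x2(s1, s2):
    """Roots of |s1 - lam * s2| = 0 for symmetric 2x2 matrices, ascending."""
    a = s2[0][0] * s2[1][1] - s2[0][1] ** 2                       # |s2|
    b = (s1[0][0] * s2[1][1] + s1[1][1] * s2[0][0]
         - 2.0 * s1[0][1] * s2[0][1])
    c = s1[0][0] * s1[1][1] - s1[0][1] ** 2                       # |s1|
    root = math.sqrt(b * b - 4.0 * a * c)
    return (b - root) / (2.0 * a), (b + root) / (2.0 * a)

def j_lambda(lam):
    """Divergence (5.4) of the linear function belonging to the root lam."""
    return 0.5 * lam + 0.5 / lam - 1.0

def i_lambda(lam):
    """Mean information (5.7) of the linear function belonging to lam."""
    return -0.5 * math.log(lam) - 0.5 + 0.5 * lam

# Matrices of (5.12).
sigma1 = [[12496.8 / 7, -6786.6 / 7], [-6786.6 / 7, 32985.0 / 7]]
sigma2 = [[136972.6 / 49, 58549.0 / 49], [58549.0 / 49, 71496.1 / 49]]

lam1, lam2 = roots_2x2(sigma1, sigma2)
j1, j2 = j_lambda(lam1), j_lambda(lam2)
efficiency = max(j1, j2) / (j1 + j2)
# Ratio a1/a2 from the first row of (sigma1 - lam2 * sigma2) a = 0.
ratio = -(sigma1[0][1] - lam2 * sigma2[0][1]) / (sigma1[0][0] - lam2 * sigma2[0][0])
print(lam1, lam2, j1, j2, efficiency, ratio)
print(i_lambda(lam1), i_lambda(lam2))
```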
If now we assume in particular that

(5.25)  $\sigma_{(1)} = P, \qquad \sigma_{(2)} = I_k,$

where P is the matrix of population correlation coefficients, and $I_k$ is the identity
matrix of order k, then

(5.26)  $J(1,2) = \tfrac12 \operatorname{tr}\bigl((P - I_k)(I_k - P^{-1})\bigr) = \tfrac12 \operatorname{tr}(P + P^{-1} - 2I_k) = \tfrac12 \sum_{i=1}^{k} (\rho^{ii} - 1) = \tfrac12 \sum_{i=1}^{k} \frac{\rho^2_{i.12\cdots(i-1)(i+1)\cdots k}}{1 - \rho^2_{i.12\cdots(i-1)(i+1)\cdots k}},$

where $\rho^{ii}$ are the diagonal elements of $P^{-1}$, and $\rho_{i.12\cdots(i-1)(i+1)\cdots k}$
are the population multiple correlation coefficients, and

(5.27)  $I(1:2) = -\tfrac12 \log |P|.$

The best linear discriminant function (3.1) is that for which the a's satisfy

(5.28)  $Pa = \lambda_1 a \quad \text{or} \quad Pa = \lambda_k a,$

according as $\lambda_1\lambda_k < 1$ or $\lambda_1\lambda_k > 1$, where $\lambda_1 \le \lambda_2 \le \cdots \le \lambda_k$ are the roots of

(5.29)  $|P - \lambda I_k| = 0,$

all of which are real and positive. It is easily verified ([6], [14]) that

(5.30)  $X'P^{-1}X = y_1^2/\lambda_1 + \cdots + y_k^2/\lambda_k,$
        $X'X = y_1^2 + \cdots + y_k^2,$

where $y_i$ is the value of (3.1) corresponding to

(5.31)  $Pa = \lambda_i a.$


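For the bivariate case in particular, we have

(5.32)  $P = \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix},$

(5.33)  $I(1:2) = -\tfrac12 \log(1 - \rho^2), \qquad J(1,2) = \rho^2/(1 - \rho^2),$

(5.34)  $|P - \lambda I_2| = \lambda^2 - 2\lambda + 1 - \rho^2 = 0,$

(5.35)  $\lambda_1 = 1 - \rho, \quad \lambda_2 = 1 + \rho, \quad \rho > 0,$

(5.36)  $\lambda_1 = 1 + \rho, \quad \lambda_2 = 1 - \rho, \quad \rho < 0,$

(5.37)  $J'(1,2;\lambda_1) = \frac{1 - \rho}{2} + \frac{1}{2(1 - \rho)} - 1 = \frac{\rho^2}{2(1 - \rho)},$

(5.38)  $J'(1,2;\lambda_2) = \frac{1 + \rho}{2} + \frac{1}{2(1 + \rho)} - 1 = \frac{\rho^2}{2(1 + \rho)},$

(5.39)  $J'(1,2;\lambda_1) + J'(1,2;\lambda_2) = \frac{\rho^2}{2(1 - \rho)} + \frac{\rho^2}{2(1 + \rho)} = \frac{\rho^2}{1 - \rho^2} = J(1,2),$

(5.40)  $J'(1,2;\lambda_1)/J(1,2) = (1 + \rho)/2,$

(5.41)  $J'(1,2;\lambda_2)/J(1,2) = (1 - \rho)/2,$

(5.42)  $I'(1:2;\lambda_1) = -\tfrac12 \log(1 - \rho) - \tfrac12 + \tfrac12(1 - \rho) = -\tfrac12 \log(1 - \rho) - \tfrac12\rho,$

(5.43)  $I'(1:2;\lambda_2) = -\tfrac12 \log(1 + \rho) - \tfrac12 + \tfrac12(1 + \rho) = -\tfrac12 \log(1 + \rho) + \tfrac12\rho,$

(5.44)  $y_1 = (x_1 - x_2)/\sqrt{2}; \qquad y_2 = (x_1 + x_2)/\sqrt{2},$

        $\frac{1}{1 - \rho^2}(x_1^2 - 2\rho x_1 x_2 + x_2^2) = \frac{(x_1 - x_2)^2}{2(1 - \rho)} + \frac{(x_1 + x_2)^2}{2(1 + \rho)}.$

Note that if $\rho > 0$, the best linear discriminant function corresponds to $\lambda_1$,
as is evident from (5.40), and also from the fact that $\lambda_1\lambda_2 = 1 - \rho^2 < 1$.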
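
The bivariate identities above lend themselves to a quick numerical check. The following minimal sketch assumes a hypothetical correlation of 0.6 and uses illustrative function names; it evaluates (5.4) and (5.7) at the two roots 1 − ρ and 1 + ρ and compares the results with the closed forms.

```python
import math

def f(lam):
    """The function (5.9), equal to J'(1,2; lam) in (5.4)."""
    return 0.5 * lam + 0.5 / lam - 1.0

def bivariate(rho):
    """Divergences and informations for sigma_(1) = P, sigma_(2) = I_2."""
    lam_a, lam_b = 1.0 - rho, 1.0 + rho
    return {
        "j_a": f(lam_a),
        "j_b": f(lam_b),
        "j_full": rho * rho / (1.0 - rho * rho),
        "i_a": -0.5 * math.log(lam_a) - 0.5 + 0.5 * lam_a,
        "i_b": -0.5 * math.log(lam_b) - 0.5 + 0.5 * lam_b,
        "i_full": -0.5 * math.log(1.0 - rho * rho),
    }

rho = 0.6   # hypothetical correlation, for illustration only
r = bivariate(rho)
print(r["j_a"] + r["j_b"], r["j_full"])          # the sum in (5.39)
print(r["j_a"] / r["j_full"], (1.0 + rho) / 2)   # the efficiency (5.40)
print(r["i_a"] + r["i_b"], r["i_full"])          # (5.8) against (5.33)
```

For positive ρ the root 1 − ρ carries the larger share (1 + ρ)/2 of the divergence, in agreement with the closing remark of this section.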


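6. Canonical correlation. Let us assume that $\mu_{(1)} = \mu_{(2)}$, and that

(6.1)  $\sigma_{(1)} = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix}, \qquad \sigma_{(2)} = \begin{pmatrix} \Sigma_{11} & 0 \\ 0 & \Sigma_{22} \end{pmatrix},$

where

        $\Sigma_{11} = (\sigma_{ij}), \quad i, j = 1, 2, \ldots, k_1,$
        $\Sigma_{22} = (\sigma_{rs}), \quad r, s = k_1 + 1, \ldots, k_1 + k_2 = k,$
        $\Sigma_{12} = (\sigma_{is}), \quad \Sigma_{21} = \Sigma_{12}'.$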
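Since, as may be readily verified,

(6.2)  $\begin{pmatrix} I_{k_1} & 0 \\ -\Sigma_{21}\Sigma_{11}^{-1} & I_{k_2} \end{pmatrix}\begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix} = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ 0 & \Sigma_{22.1} \end{pmatrix},$

we have

(6.3)  $\sigma_{(1)}^{-1} = \begin{pmatrix} \Sigma_{11}^{-1} + \Sigma_{11}^{-1}\Sigma_{12}\Sigma_{22.1}^{-1}\Sigma_{21}\Sigma_{11}^{-1} & -\Sigma_{11}^{-1}\Sigma_{12}\Sigma_{22.1}^{-1} \\ -\Sigma_{22.1}^{-1}\Sigma_{21}\Sigma_{11}^{-1} & \Sigma_{22.1}^{-1} \end{pmatrix},$

where $\Sigma_{22.1} = \Sigma_{22} - \Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12}$. Thus (cf. [1] p. 1…)

(6.4)  $J(1,2) = \tfrac12 \operatorname{tr} \sigma_{(2)}\sigma_{(1)}^{-1} - \tfrac12 k = \operatorname{tr} \Sigma_{11}^{-1}\Sigma_{12}\Sigma_{22.1}^{-1}\Sigma_{21},$

(6.5)  $I(1:2) = \tfrac12 \log \frac{|\Sigma_{11}|\,|\Sigma_{22}|}{|\sigma_{(1)}|} = \tfrac12 \log \frac{|\Sigma_{22}|}{|\Sigma_{22.1}|}.$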


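If we write the linear function of (3.1) as

(6.6)  $y = \beta_1 x_1 + \cdots + \beta_{k_1} x_{k_1} + \gamma_1 x_{k_1+1} + \cdots + \gamma_{k_2} x_{k_1+k_2},$

then (5.2) may be written as

(6.7)  $\begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix}\begin{pmatrix} \beta \\ \gamma \end{pmatrix} = \lambda \begin{pmatrix} \Sigma_{11} & 0 \\ 0 & \Sigma_{22} \end{pmatrix}\begin{pmatrix} \beta \\ \gamma \end{pmatrix},$

where $\beta$ and $\gamma$ are respectively the one-column matrices of $\beta_1, \ldots, \beta_{k_1}$ and
$\gamma_1, \ldots, \gamma_{k_2}$, and (5.3) may be written as

(6.8)  $\begin{vmatrix} (1 - \lambda)\Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & (1 - \lambda)\Sigma_{22} \end{vmatrix} = 0.$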
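Since (6.7) is equivalent to

(6.9)  $\Sigma_{11}\beta + \Sigma_{12}\gamma = \lambda\Sigma_{11}\beta,$
       $\Sigma_{21}\beta + \Sigma_{22}\gamma = \lambda\Sigma_{22}\gamma,$

or

(6.10)  $\beta = -\frac{1}{1 - \lambda}\Sigma_{11}^{-1}\Sigma_{12}\gamma,$
        $-\frac{1}{1 - \lambda}\Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12}\gamma + (1 - \lambda)\Sigma_{22}\gamma = 0,$

we conclude that (6.8) is equivalent to

(6.11)  $|\Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12} - \rho^2\Sigma_{22}| = 0,$

where $\rho^2 = (1 - \lambda)^2$. According to (5.3), the roots of (6.8) are all real and
positive. If we take $k_2 \le k_1$, then since $k = k_1 + k_2$, and the determinant of (6.11)
is of order $k_2$,

(6.12)  $\lambda_i = 1 - \rho_i, \qquad \lambda_{k_1+i} = 1 + \rho_{k_2+1-i}, \qquad i = 1, 2, \ldots, k_2,$
        $\lambda_{k_2+1} = \cdots = \lambda_{k_2+(k_1-k_2)} = 1,$

where $\rho_1 \ge \rho_2 \ge \cdots \ge \rho_{k_2}$. We may also conclude that $-1 < \rho_i < 1$ since
the $\lambda$'s cannot be negative. The $\rho_i$ are Hotelling's canonical correlations [7].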
The results of (5.4) now become

(6.13)  $J'(1,2;\lambda_i) = \tfrac12(1 - \rho_i) + \frac{1}{2(1 - \rho_i)} - 1 = \frac{\rho_i^2}{2(1 - \rho_i)}, \qquad i = 1, 2, \ldots, k_2,$
        $J'(1,2;\lambda_{k_1+i}) = \tfrac12(1 + \rho_{k_2+1-i}) + \frac{1}{2(1 + \rho_{k_2+1-i})} - 1 = \frac{\rho^2_{k_2+1-i}}{2(1 + \rho_{k_2+1-i})}, \qquad i = 1, 2, \ldots, k_2,$
        $J'(1,2;\lambda_i) = \tfrac12 + \tfrac12 - 1 = 0, \qquad i = k_2 + 1, \ldots, k_2 + (k_1 - k_2),$

or

(6.14)  $J'(1,2;\lambda_i) + J'(1,2;\lambda_{k+1-i}) = \rho_i^2/(1 - \rho_i^2), \qquad i = 1, 2, \ldots, k_2,$

and from (5.5) and (5.8) respectively

(6.15)  $J(1,2) = \sum_{i=1}^{k_2} \rho_i^2/(1 - \rho_i^2),$

(6.16)  $I(1:2) = -\tfrac12 \log (1 - \rho_1^2)(1 - \rho_2^2) \cdots (1 - \rho_{k_2}^2).$

Since

        $\lambda_1\lambda_k = (1 - \rho_1)(1 + \rho_1) = 1 - \rho_1^2 < 1,$

the best linear discriminant function (6.6) corresponds to the value $\lambda_1$, or to
the largest canonical correlation.

If we pose the problem of finding the best pair of linear discriminant
functions

(6.17)  $u = \beta_1 x_1 + \cdots + \beta_{k_1} x_{k_1},$
        $v = \gamma_1 x_{k_1+1} + \cdots + \gamma_{k_2} x_{k_1+k_2},$

then we want to maximize (see (5.33))

(6.18)  $J'(1,2;u,v) = \frac{\rho_{uv}^2}{1 - \rho_{uv}^2} = \frac{(\beta'\Sigma_{12}\gamma)^2}{(\beta'\Sigma_{11}\beta)(\gamma'\Sigma_{22}\gamma) - (\beta'\Sigma_{12}\gamma)^2}.$

The usual methods lead us again to the condition (6.9), where $(1 - \lambda)^2 = \rho_{uv}^2$.
We thus see that the canonical correlation coefficients are the values of $\rho_{uv}$,
and

(6.19)  $J'(1,2;u_i,v_i) = J'(1,2;\lambda_i) + J'(1,2;\lambda_{k+1-i}), \qquad i = 1, 2, \ldots, k_2.$

The best pair of linear discriminant functions thus corresponds to $\rho_1$, that is,
the largest of the canonical correlations.
To illustrate, let us take as $\sigma_{(1)}$ the matrix


(1.0000 


.0655 


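which is Kelley's data discussed on p. 342 of [7]. As the roots of (6.8) we find
(cf. Ex. 28.4, p. 351, of [8])

(6.21)  $\lambda_1 = 0.6055, \quad \lambda_2 = 0.9312, \quad \lambda_3 = 1.0688, \quad \lambda_4 = 1.3945,$

(6.22)  $\rho_1^2 = (1 - .6055)(1.3945 - 1) = .1556,$
        $\rho_2^2 = (1 - .9312)(1.0688 - 1) = .0047,$

(6.23)  $J'(1,2;\lambda_1) = .1285, \qquad J'(1,2;\lambda_2) = .0025,$
        $J'(1,2;\lambda_3) = .0022, \qquad J'(1,2;\lambda_4) = .0558,$

(6.24)  $J'(1,2;u_1,v_1) = \frac{\rho_1^2}{1 - \rho_1^2} = \frac{.1556}{.8444} = .1843, \qquad J'(1,2;u_2,v_2) = \frac{\rho_2^2}{1 - \rho_2^2} = \frac{.0047}{.9953} = .0047,$

(6.25)  $J(1,2) = .1843 + .0047 = .1890.$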


The linear function associated with $\lambda_1$ is 68.0% efficient. The pair of linear
functions (6.17) related to the correlation $\rho_1^2 = .1556$ (see [7], [8] loc. cit.),

(6.26)  $u_1 = -2.7772x_1 + 2.2655x_2,$
        $v_1 = -2.4404x_3 + x_4,$

are 97.5% efficient (and thus practically sufficient) since

(6.27)  $\frac{J'(1,2;\lambda_1)}{J(1,2)} = \frac{.1285}{.1890} = .680, \qquad \frac{J'(1,2;u_1,v_1)}{J(1,2)} = \frac{.1843}{.1890} = .975.$

Using the values in (6.22), we have from (6.16)

(6.28)  $I(1:2) = -\tfrac12 \log (.8444)(.9953) = .0869,$

and therefore

(6.29)  $I(2:1) = .1890 - .0869 = .1021.$

Similarly

(6.30)  $I'(1:2;u_1,v_1) = -\tfrac12 \log(.8444) = .0846,$

(6.31)  $I'(2:1;u_1,v_1) = .1843 - .0846 = .0997.$
These values are summarized in Table 2. 


TABLE 2

              $u_1, v_1$    $u_2, v_2$
I'(1:2)        .0846        .0023        .0869 = I(1:2)
I'(2:1)        .0997        .0024        .1021 = I(2:1)
J'(1,2)        .1843        .0047        .1890 = J(1,2)


From Table 2 it seems reasonable to infer that the linear functions (6.26) 
are the only such components. (See p. 342 [7].) 
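
The entries of (6.22) to (6.27) follow from the four roots in (6.21) alone. The sketch below is an illustrative check, not the author's computation: it takes the printed roots as given (so agreement is limited by their four-decimal rounding), and the variable names are assumptions.

```python
import math

def f(lam):
    """J'(1,2; lam) as in (5.4)."""
    return 0.5 * lam + 0.5 / lam - 1.0

lams = [0.6055, 0.9312, 1.0688, 1.3945]       # roots printed in (6.21)
rhos = [1.0 - lams[0], 1.0 - lams[1]]         # canonical correlations via (6.12)

j_linear = [f(lam) for lam in lams]                    # compare with (6.23)
j_pairs = [r * r / (1.0 - r * r) for r in rhos]        # compare with (6.24)
j_total = sum(j_pairs)                                 # compare with (6.25)
i_pairs = [-0.5 * math.log(1.0 - r * r) for r in rhos]

eff_linear = j_linear[0] / j_total     # single best linear function
eff_pair = j_pairs[0] / j_total        # best pair (u_1, v_1)
print([round(v, 4) for v in j_linear], round(j_total, 4))
print(round(eff_linear, 3), round(eff_pair, 3), round(sum(i_pairs), 4))
```

Because the printed roots are symmetric about 1, the pairing in (6.14) holds exactly here, and the four single-function divergences sum to J(1,2).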


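7. Discriminant functions with covariance. Assume that

(7.1)  $\sigma_{(1)} = \sigma_{(2)} = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix},$

$\Sigma_{11}$, $\Sigma_{12}$, $\Sigma_{22}$ as defined in (6.1). Let the one-row matrix of means be

(7.2)  $(\mu'_{(i)}, \nu'_{(i)}), \qquad i = 1, 2,$

where the $\mu$'s are the means of the first $k_1$ variables, and the $\nu$'s the means of
the last $k_2$ variables. Assume that $\nu_{(1)} = \nu_{(2)}$. Then

(7.3)  $J(1,2) = (\delta', 0)\begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix}^{-1}\begin{pmatrix} \delta \\ 0 \end{pmatrix},$

where $\delta$ is defined as in (2.9). Using the value of the inverse matrix as given
in (6.3) it follows that

(7.4)  $J(1,2) = \delta'\Sigma_{11}^{-1}\delta + \delta'\Sigma_{11}^{-1}\Sigma_{12}\Sigma_{22.1}^{-1}\Sigma_{21}\Sigma_{11}^{-1}\delta.$

Let

(7.5)  $y_i = x_i, \qquad i = 1, 2, \ldots, k_1,$

then

(7.6)  $J'(1,2) = \delta'\Sigma_{11}^{-1}\delta.$

The gain due to the use of the covariance variates $x_{k_1+1}, \ldots, x_{k_1+k_2}$ in the
linear discriminant function is thus given by (see [4])

(7.7)  $J(1,2)/J'(1,2) = 1 + \lambda,$

where

(7.8)  $\lambda = \frac{\delta'\Sigma_{11}^{-1}\Sigma_{12}\Sigma_{22.1}^{-1}\Sigma_{21}\Sigma_{11}^{-1}\delta}{\delta'\Sigma_{11}^{-1}\delta},$

and $\lambda$ will take on a value between the smallest and largest root of the
determinantal equation

(7.9)  $|\Sigma_{11}^{-1}\Sigma_{12}\Sigma_{22.1}^{-1}\Sigma_{21}\Sigma_{11}^{-1} - \lambda\Sigma_{11}^{-1}| = 0.$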

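Indeed, since the quadratic form in the denominator of $\lambda$ is positive definite,
there exists a real nonsingular transformation

(7.10)  $\delta = A\gamma,$

such that

(7.11)  $\lambda = \frac{\lambda_1\gamma_1^2 + \lambda_2\gamma_2^2 + \cdots + \lambda_{k_1}\gamma_{k_1}^2}{\gamma_1^2 + \gamma_2^2 + \cdots + \gamma_{k_1}^2},$

where $\lambda_1, \lambda_2, \ldots, \lambda_{k_1}$ are the roots of (7.9), or

(7.12)  $|\Sigma_{12}\Sigma_{22.1}^{-1}\Sigma_{21} - \lambda\Sigma_{11}| = 0.$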

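If $k_1 \ge k_2$, then since

(7.13)  $\begin{vmatrix} \lambda\Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22.1} \end{vmatrix} = |\Sigma_{22.1}| \cdot |\lambda\Sigma_{11} - \Sigma_{12}\Sigma_{22.1}^{-1}\Sigma_{21}| = |\lambda\Sigma_{11}| \cdot \Bigl|\Sigma_{22.1} - \frac{1}{\lambda}\Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12}\Bigr|,$

it follows that

(7.14)  $|\Sigma_{12}\Sigma_{22.1}^{-1}\Sigma_{21} - \lambda\Sigma_{11}| = 0 = |\lambda\Sigma_{22.1} - \Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12}|.$

Since $\Sigma_{22.1} = \Sigma_{22} - \Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12}$, the right member of (7.14) reduces to

(7.15)  $|\Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12} - \rho^2\Sigma_{22}| = 0,$

where $\rho^2 = \lambda/(1 + \lambda)$. Comparison with (6.11) shows that the roots of (7.15)
are the canonical correlations.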

In the case $k_2 = 1$, there is only one canonical correlation and it is the
multiple correlation of $x_k$ on $x_1, x_2, \ldots, x_{k-1}$, so that

(7.16)  $\frac{J(1,2)}{J'(1,2)} = 1 + \lambda \le 1 + \frac{\rho^2_{k.12\cdots(k-1)}}{1 - \rho^2_{k.12\cdots(k-1)}} = \frac{1}{1 - \rho^2_{k.12\cdots(k-1)}}.$

Thus, using the values for the Rabbits × doses components in Table 2, p.
157 of Cochran and Bliss [4], we have for the matrix (7.1),

(7.17)  $\sigma = \begin{pmatrix} 3223 & 1200 & 1259 \\ 1200 & 3137 & 1340 \\ 1259 & 1340 & 2351 \end{pmatrix},$

and

(7.18)  $\Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12} = \frac{1259\begin{vmatrix} 1259 & 1200 \\ 1340 & 3137 \end{vmatrix} + 1340\begin{vmatrix} 3223 & 1259 \\ 1200 & 1340 \end{vmatrix}}{\begin{vmatrix} 3223 & 1200 \\ 1200 & 3137 \end{vmatrix}} = 774,$

(7.19)  $\rho^2_{3.12} = \frac{774}{2351} = .33,$

(7.20)  $J/J' \le \frac{1}{1 - \rho^2_{3.12}} = 1.50.$

On p. 162 of [4] it was concluded that the use of covariance gives 50% more
information.

Solving for the coefficients of the discriminant function in the equations
(see [4], p. 157)

(7.21)  $3223a_1 + 1200a_2 + 1259a_3 = -1197.2 \times 33,$
        $1200a_1 + 3137a_2 + 1340a_3 = -844.3 \times 33,$
        $1259a_1 + 1340a_2 + 2351a_3 = 0,$

it is found that (see (4.2), (4.3))

(7.22)  $33J = 1197.2 \times .41848 + 844.3 \times .27070 = 729.556.$

If we solve (omitting the covariance variable)

(7.23)  $3223\beta_1 + 1200\beta_2 = -1197.2 \times 33,$
        $1200\beta_1 + 3137\beta_2 = -844.3 \times 33,$

for the coefficients of the linear discriminant function, it is found that (see
(4.2), (4.3))

(7.24)  $33J' = 1197.2 \times .31629 + 844.3 \times .14815 = 503.745,$


and 


(7.25) 
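
The quantities in (7.18) to (7.24) can be recomputed from the matrix (7.17) and the right-hand sides of (7.21). The following is a minimal sketch under stated assumptions: the common factor 33 and the signs on the right-hand sides are dropped because they cancel in the ratio of the two quadratic forms, and the helper names `solve` and `dot` are illustrative.

```python
def solve(m, b):
    """Solve m x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    a = [list(row) + [b[i]] for i, row in enumerate(m)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        a[c], a[p] = a[p], a[c]
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            for j in range(c, n + 1):
                a[r][j] -= factor * a[c][j]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (a[r][n] - sum(a[r][j] * x[j] for j in range(r + 1, n))) / a[r][r]
    return x

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

sigma = [[3223.0, 1200.0, 1259.0],
         [1200.0, 3137.0, 1340.0],
         [1259.0, 1340.0, 2351.0]]           # matrix (7.17)
delta = [1197.2, 844.3, 0.0]                 # mean differences, sign dropped

s11 = [row[:2] for row in sigma[:2]]
s12 = [sigma[0][2], sigma[1][2]]

q_full = dot(delta, solve(sigma, delta))             # compare with (7.22)
q_reduced = dot(delta[:2], solve(s11, delta[:2]))    # compare with (7.24)
gain = q_full / q_reduced                            # J / J' as in (7.7)

regression = dot(s12, solve(s11, s12))               # compare with (7.18)
rho_sq = regression / sigma[2][2]                    # compare with (7.19)
bound = 1.0 / (1.0 - rho_sq)                         # upper bound (7.16)
print(round(q_full, 3), round(q_reduced, 3), round(gain, 3))
print(round(regression, 1), round(rho_sq, 3), round(bound, 3))
```

In this sketch the observed gain stays below the bound of (7.16), as the argument of this section requires.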


8. Conclusion. It is seen that the multivariate analysis techniques of dis- 
criminant analysis, principal components and canonical correlations are indeed 
closely related concepts associated with a linear discriminant function, and 
differing primarily in the assumption about the underlying populations. 
CORRECTIONS FOR NONNORMALITY IN THE USE OF THE
TWO-SAMPLE t- AND F-TESTS AT HIGH SIGNIFICANCE
LEVELS¹


By RALPH ALLAN BRADLEY


University of North Carolina and Virginia Polytechnic Institute 


Summary. Correction factors to the probabilities that the two-sample t and
F statistics shall exceed fixed positive values $t_0$ and $F_0$, either numerically or
arithmetically, and to the probabilities that t shall be exceeded by fixed negative
values $t_0$ have been derived geometrically. The derivations, when the
population has a normal density function, produce a novel method of obtaining the
usual distributions of t and of F. The correction factors permit the use of
existing tables and the results are asymptotically correct for numerically large
values of the test statistics.

There is some indication that the correction factors are better for small 
sample sizes than large ones. Useful bounds on the errors committed by using 
the asymptotic corrections at usual significance levels have not been obtained 
and are a subject for future investigation. 

For difficult cases a method of approximate evaluation of the correction fac- 
tors is provided. 

1. Introduction. Many papers have been written on the subject of nonnor- 
mality and its effect on common tests of significance. The methods employed 
may be classified in six categories as follows: (i) transformation of the observed 
variate; (ii) tests based on new statistics adapted to the specific underlying 
populations; (iii) rank order methods and combinatorial tests; (iv) approximate 
tests based on limiting distributions of the statistics commonly used; (v) ex- 
perimental sampling; and (vi) exact distributions of the common statistics for 
a nonnormal observed variate. Discussions which are typical of the above 
classifications are given in references [1] to [10]. 

Exact distributions of the statistic t, formulated by “Student” [11] and
modified and extended by R. A. Fisher [12], [13], have been found in certain trivial
cases for nonnormal populations. In addition, M. S. Bartlett [14] and R. C.
Geary [15] have attempted to investigate errors in the significance levels of t
by considering the first few terms of a Gram-Charlier series. Their results
depend on a knowledge of measures of skewness and on neglecting powers of the
sample size smaller than its inverse and powers of the skewness of higher order
than the first. A somewhat better approximation to the distribution of t has
been presented by A. K. Gayen [23], who used the Edgeworth series, but it was
assumed that population cumulants were known. B. H. Camp [16] has recently
studied the effects of slight changes in the parent density functions on the
distribution of statistics and he puts bounds on the resultant changes in the
distributions of the statistics. These bounds are not close for large samples.

¹ Research supported in part by scholarships awarded by the National Research
Council of Canada and by the Ontario Research Foundation.

The importance of investigating the effects of nonnormality is indicated
by the vast amount of literature available on this subject, and, in spite of the
paucity of practical results obtained from theoretical considerations, this
nevertheless seems the desirable approach to the problem. In the present paper, a
method of Harold Hotelling [17], [18] used to obtain a correction to the
distribution of the “Student” t, valid at high significance levels, is extended to the
two-sample t statistic and to the F statistic for testing hypotheses on the means
of a number of groups of observations. A second paper [24] shows series
expansions of the distribution functions of t and of F.


2. A correction to the distribution of the two-sample t statistic for
nonnormality. The usual method of deriving the distribution of the t statistic for tests
of hypotheses on the means of two normal populations, assumed to have equal
variances, consists of reducing the joint sample density to a form similar to the
one obtained in the distribution of the statistic for a single sample. This
similarity may no longer exist for nonnormal populations, and we shall proceed by
geometrical considerations.

The joint sample density for two sets of independent observations,
$x_1, \ldots, x_{N_1}$ and $y_1, \ldots, y_{N_2}$, may be written

(1)  $P(S) = \prod_{\alpha=1}^{N_1} f(x_\alpha) \prod_{\beta=1}^{N_2} f(y_\beta)$

under the assumptions of independence and equal density functions of the
form f(u). We reserve $\alpha$ and $\beta$ to have the above values in this section and, in
addition, define

(2)  $n = N_1 + N_2 - 2.$

The sample determines a point S with coordinates $(x_1, \ldots, x_{N_1}, y_1, \ldots, y_{N_2})$
in a Euclidean space $\Omega$ of (n + 2) dimensions and a line OS when the origin is
designated by O. A fixed line OA is taken with direction cosines of which the
first $N_1$ are $N_1^{-1/2}N_2^{1/2}(n + 2)^{-1/2}$ and the remaining $N_2$ are $-N_1^{1/2}N_2^{-1/2}(n + 2)^{-1/2}$. This
line is at once seen to lie in a hyperplane $\Omega_1$ of (n + 1) dimensions given by

(3)  $\sum_\alpha x_\alpha + \sum_\beta y_\beta = 0.$

If $S_1$ is the projection of S on $\Omega_1$, and if $OS_1$ is projected on OA in $\Omega_1$, then
it is easy to show that

(4)  $t = \{nN_1N_2\}^{1/2}(n + 2)^{-1/2}(\bar{x} - \bar{y})\Bigl\{\sum_\alpha (x_\alpha - \bar{x})^2 + \sum_\beta (y_\beta - \bar{y})^2\Bigr\}^{-1/2} = n^{1/2}\cot\theta,$

where $\bar{x}$ and $\bar{y}$ are the means of the two samples and $\theta$ is the angle between $OS_1$
and OA. The definition of t is of course the usual one.
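
The geometric form of (4) can be compared with the usual pooled-variance computation. The following minimal sketch uses made-up observations and illustrative function names; it projects the sample point on the hyperplane (3), measures the angle with the fixed line OA, and evaluates the square root of n times the cotangent.

```python
import math

def t_usual(x, y):
    """Two-sample t with pooled variance on n = N1 + N2 - 2 degrees of freedom."""
    n1, n2 = len(x), len(y)
    mx, my = sum(x) / n1, sum(y) / n2
    ss = sum((v - mx) ** 2 for v in x) + sum((v - my) ** 2 for v in y)
    n = n1 + n2 - 2
    return (mx - my) / math.sqrt(ss / n * (1.0 / n1 + 1.0 / n2))

def t_geometric(x, y):
    """sqrt(n) * cot(theta), theta being the angle between OS_1 and OA."""
    n1, n2 = len(x), len(y)
    total = n1 + n2                      # equals n + 2
    n = total - 2
    s = list(x) + list(y)
    grand = sum(s) / total
    s1 = [v - grand for v in s]          # projection of S on the hyperplane (3)
    oa = ([math.sqrt(n2 / (n1 * total))] * n1
          + [-math.sqrt(n1 / (n2 * total))] * n2)   # direction cosines of OA
    cos_theta = sum(u * v for u, v in zip(s1, oa)) / math.sqrt(sum(u * u for u in s1))
    theta = math.acos(cos_theta)
    return math.sqrt(n) * math.cos(theta) / math.sin(theta)

# Hypothetical observations, for illustration only.
x = [5.1, 4.9, 6.2, 5.8]
y = [4.2, 4.8, 3.9]
print(t_usual(x, y), t_geometric(x, y))
```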

It remains to find the induced density of $S_1$ in $\Omega_1$, which may be effected by an
orthogonal rotation of axes. Let $\Gamma$ be any orthogonal matrix whose first and
second rows are the row matrices

     $(n + 2)^{-1/2}\bigl[N_1^{-1/2}N_2^{1/2}, \ldots, N_1^{-1/2}N_2^{1/2}, -N_1^{1/2}N_2^{-1/2}, \ldots, -N_1^{1/2}N_2^{-1/2}\bigr]$

and $(n + 2)^{-1/2}[1, \ldots, 1, 1, \ldots, 1]$ respectively. The division in each of these
vectors comes after the $N_1$th element. The transformation to new variables
$z_1, \ldots, z_{n+2}$ is defined by

(5)  $Z' = \Gamma\begin{bmatrix} X' \\ Y' \end{bmatrix}, \qquad \begin{bmatrix} X' \\ Y' \end{bmatrix} = \Gamma'Z',$

where X', Y', Z' represent column vectors of the indicated variables and $\Gamma'$
is the transpose of $\Gamma$. $\Omega_1$ is now simply defined by the equation $z_2 = 0$. The
induced density of $S_1$ is

(6)  $P'(S_1) = \int_{-\infty}^{\infty} P(S)\, dz_2,$

provided this integral exists and where, in view of (5), P(S) is considered to
be a function of the new variables.

If A be the point on the line OA for which the coordinates in the original
system are a multiple $w_1$ of the direction cosines of OA, then in terms of the
new variables, A has coordinates $(w_1, 0, \ldots, 0)$. All points Q in $\Omega$ having
Z-coordinates $(w_1, w_2, 0, \ldots, 0)$ project into A. Thence, by means of the
inverse transformation (5), the density of Q is

(7)  $P(Q) = \Bigl\{f\Bigl(\frac{N_1^{-1/2}N_2^{1/2}w_1 + w_2}{(n + 2)^{1/2}}\Bigr)\Bigr\}^{N_1}\Bigl\{f\Bigl(-\frac{N_1^{1/2}N_2^{-1/2}w_1 - w_2}{(n + 2)^{1/2}}\Bigr)\Bigr\}^{N_2}.$

Then,

(8)  $P'(A) = \int_{-\infty}^{\infty} P(Q)\, dw_2.$

In a similar manner, the induced density of the point A', diametrically opposed
to A, is

(9)  $P'(A') = \int_{-\infty}^{\infty} P(Q')\, dw_2,$

where P(Q') is equivalent to P(Q) with $w_1$ and $w_2$ replaced by $-w_1$ and $-w_2$.

Let all density in $\Omega_1$ be projected orthogonally onto the surface of the
(n + 1)-dimensional sphere with unit radius about O. We shall designate the
projections of $S_1$, A, A' by $\Sigma$, A* and A** respectively. This projection implicitly
involves a transformation to generalized spherical coordinates. We may restrict
$w_1$ to be positive and the induced densities on the surface of the sphere for A*
and A** are

(10)  $P''(A^*) = \int_0^\infty \int_{-\infty}^{\infty} w_1^n P(Q)\, dw_2\, dw_1$

and

(11)  $P''(A^{**}) = \int_0^\infty \int_{-\infty}^{\infty} w_1^n P(Q')\, dw_2\, dw_1.$

The induced density of $\Sigma$ may be written as

(12)  $P''(\Sigma) = \int_0^\infty \rho^n P'(S_1)\, d\rho,$

if we think of $P'(S_1)$ as a function of n angles.

Now t is constant for all points $S_1$ for which $OS_1$ makes a constant angle with
OA. The locus of such points will be the surface of an infinite right spherical
cone with axis OA and generated by the line $OS_1$. The locus of $\Sigma$ is the surface
of a sphere of n dimensions with radius $\sin\theta$ on the surface of the unit
hypersphere of (n + 1) dimensions in $\Omega_1$. The differential element of probability
for the variate $\theta$ is obtained from a consideration of the induced probability
of the n-dimensional region of width $d\theta$ about the locus of $\Sigma$. In order to get
an exact expression for this probability, it would be necessary to integrate
$P''(\Sigma)$ over this surface. However, for values of $\theta$ close to 0 or to $\pi$ (or for large
absolute values of t) and subject to the existence and continuity of $P''(\Sigma)$, we
expect $P''(A^*)$ to be approximately equal to $D(\Sigma)$ for values of $\theta$ close to 0 and
$P''(A^{**})$ to be approximately equal to $D(\Sigma)$ for $\theta$ close to $\pi$, when $D(\Sigma)$ is used
to denote the average value of $P''(\Sigma)$ as $\Sigma$ moves over its locus of constancy for
t. We shall use this approximation.

The area of the n-dimensional region of width $d\theta$ about the locus of $\Sigma$ is given
[17] by

(13)  $S_{n-1} \sin^{n-1}\theta\, d\theta,$

where $S_{n-1}$ is the (n − 1)-dimensional area of a unit hypersphere intrinsically
of (n − 1) dimensions in Euclidean space of n dimensions. It is known that
$S_{n-1} = 2\pi^{n/2}/\Gamma(\tfrac12 n)$. The approximate element of probability for $\theta$ may be
written

(14)  $S_{n-1}P''(A^*) \sin^{n-1}\theta\, d\theta$

for small values of $\theta$, and

(15)  $S_{n-1}P''(A^{**}) \sin^{n-1}\theta\, d\theta$

for large values of $\theta$. The relation between t and $\theta$, (4), is used to write the
approximate element of probability of t, for large values of t, as

(16)  $n^{-1/2}S_{n-1}P''(A^*)(1 + t^2/n)^{-(n+1)/2}\, dt.$

For numerically large negative values of t, $P''(A^*)$ is replaced by $P''(A^{**})$.
The distribution of |t| takes, for large values, the approximate form,

(17)  $n^{-1/2}S_{n-1}[P''(A^*) + P''(A^{**})](1 + t^2/n)^{-(n+1)/2}\, d|t|.$
In the special case where f(u) is normal,

(18)  $P''(A^*) = P''(A^{**}) = \tfrac12 \Gamma(\tfrac12[n + 1])/\pi^{(n+1)/2} = 1/S_n,$

and

(19)  $P''(A^*) + P''(A^{**}) = 2/S_n.$

Here, from spherical symmetry of the density in $\Omega$, the expressions (16) and (17)
are exact and we have the familiar “Student” distribution of t. When the assumed
distributions are not normal, the true probability of obtaining a value of t
greater than a specified one differs from that based on the “Student”
distribution by a factor approximating

(20)  $S_nP''(A^*)$

when t is large. To obtain the true probability that |t| shall be exceeded, the
probabilities in the usual tables must be multiplied by a factor approximating

(21)  $\tfrac12 S_n[P''(A^*) + P''(A^{**})].$
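
The normal special case gives a simple consistency check on (16) and (18). The sketch below is illustrative only: the function names and the values of n and t are assumptions, and the sphere area follows the convention of the text, in which $S_{n-1} = 2\pi^{n/2}/\Gamma(\tfrac12 n)$.

```python
import math

def sphere_area(m):
    """S_m = 2 * pi^((m+1)/2) / Gamma((m+1)/2), the convention used for (13)."""
    return 2.0 * math.pi ** ((m + 1) / 2.0) / math.gamma((m + 1) / 2.0)

def element_16(t, n, p_a_star):
    """Approximate density of t from (16), given a value of P''(A*)."""
    return (n ** -0.5 * sphere_area(n - 1) * p_a_star
            * (1.0 + t * t / n) ** (-(n + 1) / 2.0))

def student_density(t, n):
    """Usual 'Student' density on n degrees of freedom."""
    c = math.gamma((n + 1) / 2.0) / (math.sqrt(n * math.pi) * math.gamma(n / 2.0))
    return c * (1.0 + t * t / n) ** (-(n + 1) / 2.0)

n, t = 5, 2.5                          # hypothetical values for the check
p_normal = 1.0 / sphere_area(n)        # P''(A*) under normality, from (18)
factor_20 = sphere_area(n) * p_normal  # correction factor (20) under normality
print(element_16(t, n, p_normal), student_density(t, n), factor_20)
```

With P''(A*) set to $1/S_n$ the element (16) coincides with the usual density, so the correction factor (20) reduces to unity in the normal case.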
In summary, from (10) and (11) with (8) and (9),

(22)  $P''(A^*) = \int_0^\infty \int_{-\infty}^{\infty} w_1^n \Bigl\{f\Bigl(\frac{N_1^{-1/2}N_2^{1/2}w_1 + w_2}{(n + 2)^{1/2}}\Bigr)\Bigr\}^{N_1}\Bigl\{f\Bigl(-\frac{N_1^{1/2}N_2^{-1/2}w_1 - w_2}{(n + 2)^{1/2}}\Bigr)\Bigr\}^{N_2} dw_2\, dw_1,$

while $P''(A^{**})$ is identical to $P''(A^*)$ when $w_1$ and $w_2$ are replaced by $-w_1$ and
$-w_2$ in the arguments of f. These two similar double integrals may be combined
to write the sum, $P''(A^*) + P''(A^{**})$, as one double integral.


3. A generalization to the F statistic. The F statistic used in the analysis of
variance to test the hypothesis that k sets of independent observations come
from populations with equal means, under the assumptions that the parent
populations are normal with equal variances, is an adaptation of the statistic
z discussed by R. A. Fisher [12]. Fisher obtained the probability density
function of z from a comparison of the distributions of two independent estimates
of variance with the distribution of chi-square. The method we use here may
be considered to be an extension of the geometrical derivation of the preceding
section. Since the two-sample t statistic has been discussed in detail, we shall
outline the generalization to this case.²


² A complete discussion is available in the unpublished dissertation [19].
Consider groups of observations as exemplified in the ith of k groups by
$x_{1i}, \ldots, x_{N_i i}$. Under the null hypothesis each $x_{\alpha i}$ is taken from a population
with zero mean (no loss of generality results from specifying that the common
true mean be zero), and we shall assume that the population densities are all
of the same form, f(u). A test of the hypothesis that the population means are
equal uses the statistic,

(23)  $F = \Bigl\{\sum_{i=1}^{k} N_i(\bar{x}_i - \bar{x})^2/(k - 1)\Bigr\} \Big/ \Bigl\{\sum_{i=1}^{k}\sum_{\alpha=1}^{N_i} (x_{\alpha i} - \bar{x}_i)^2/n\Bigr\},$

where we define $N = \sum_i N_i$, $n = N - k$, and $\bar{x}_i$ is the mean of the ith group.
$\bar{x}$ is the grand mean.

The observations in order from $x_{11}$ to $x_{N_k k}$ determine a point S in an
N-dimensional space $\Omega$. The probability density of S is

(24)  $P(S) = \prod_{i=1}^{k} \prod_{\alpha=1}^{N_i} f(x_{\alpha i}).$

The line OS is obtained by joining S to the origin. A hyperplane $\Omega_1$ of (N − 1)
dimensions is given by the equation,

(25)  $\sum_{i=1}^{k} \sum_{\alpha=1}^{N_i} x_{\alpha i} = 0,$

and a subspace ©, of n dimensions in Q, is determined by (25) and (k — 1) equa- 
tions, 


j Ng N 1 
(26) oe A oe ae | j=l,---,k—2D). 
l 


i=l a= a=! 


The (k — 1)-dimensional space orthogonal to Q, in Q, is designated by Q, . 
If S, is the projection of S on Q; and if @ is the angle between OS, and the 
flat space 2, , we can show that 


(27)  $F = n\cot^2\theta/(k - 1).$


This is most easily seen by an orthogonal transformation to new variables, 
Zi,°*:, Ze, Y1,°-+,Y¥,. Let T be any orthogonal matrix whose first k 
rows consist of the orthogonalized row vectors of the coefficients of equations 
(26) and (25) in that order. Such a transformation is defined by 


08 Z Jere, x =r[2] 


Let the new coordinates of S be (2:, +--+ , 2, Yi, °** » Yn). & is defined by 
Z, = O and the additional equations defining 2, are Z; = 0,7 = 1, --- ,(k— 1). 
Then the cotangent of @ is obtained by taking the ratio of the distance of the 
projection of S; on Q, from O to the distance of S, from 2, . The induced prob-
ability density of S, in Q, is 


(29)  $P'(S_1) = \int_{-\infty}^{\infty} P(S)\, dz_k,$


where P(S) is considered to be a function of the new variables. 
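
The relation (27) between F and the angle θ can be verified on a small data set. This is a minimal sketch with made-up observations and illustrative function names: the component of the centered sample vector that is constant within groups plays the part of the projection on the (k − 1)-dimensional flat space, and θ is the angle between the centered vector and that projection.

```python
import math

def f_usual(groups):
    """Analysis-of-variance F as in (23)."""
    k = len(groups)
    total = sum(len(g) for g in groups)
    n = total - k
    grand = sum(sum(g) for g in groups) / total
    between = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups)
    within = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)
    return (between / (k - 1)) / (within / n)

def f_geometric(groups):
    """n * cot^2(theta) / (k - 1), theta as described in the lead-in."""
    k = len(groups)
    total = sum(len(g) for g in groups)
    n = total - k
    grand = sum(sum(g) for g in groups) / total
    s1 = [v - grand for g in groups for v in g]              # projection on (25)
    flat = [sum(g) / len(g) - grand for g in groups for _ in g]
    norm_s1 = math.sqrt(sum(u * u for u in s1))
    norm_flat = math.sqrt(sum(u * u for u in flat))
    theta = math.acos(norm_flat / norm_s1)
    cot = math.cos(theta) / math.sin(theta)
    return n * cot * cot / (k - 1)

# Hypothetical observations in k = 3 groups, for illustration only.
groups = [[4.1, 5.0, 5.6], [6.2, 6.8, 7.1, 6.0], [5.2, 4.4]]
print(f_usual(groups), f_geometric(groups))
```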

In order to complete the derivation of the element of probability for F, it 
is now only necessary to observe the analogy between this situation, where our 
statistic is a simple function of the angle between a random line and a fixed 
flat space, and that demonstrated by Hotelling [20] in the derivation of the 
generalized “Student” ratio, where the statistic was a simple function of the 
angle between a fixed line and a random flat space. When f(u) is normal, we 
have spherical symmetry for the induced density in Q, . Following Hotelling’s 
procedure, we obtain the familiar element of probability of F, 


(30)  $(k - 1)^{(k-1)/2} n^{n/2} \{B(\tfrac12[k - 1], \tfrac12 n)\}^{-1} F^{(k-3)/2} \{(k - 1)F + n\}^{-(N-1)/2}\, dF.$


We must now consider how to correct the distribution when f(u) is not normal. 

Since F is a function of an angle alone, it is unchanged if $S_1$ is replaced by
any point on the line $OS_1$. We may then project all density in $\Omega_1$ onto
the $(N-2)$-dimensional surface of the unit hypersphere $C_1$ about the origin in
the $(N-1)$-dimensional space $\Omega_1$. Let the projection of $S_1$ be $S^*$. F remains
unchanged for all points $S^*$ on $C_1$ where $OS^*$ makes an angle $\theta$ or $-\theta$ with $\Omega_2$.
If the locus of $S^*$ is $C^*$, $C^*$ is the totality of points on two hyperspheres of
intrinsically $(k-2)$ dimensions on $C_1$ and parallel to and on either side of $\Omega_2$ at
a fixed distance $\sin\theta$. A third $(k-2)$-dimensional hypersphere on $C_1$ is obtained
from the intersection of $\Omega_2$ and $C_1$. Let us designate it by $C_2$. When $\theta$ is small
numerically, $C_2$ is very close to $C^*$ and if the density of $S^*$ exists and is continuous
over $C_1$, we may replace the average density of $S^*$ on $C^*$ by the average
density of points on $C_2$. We shall make this approximation when f(u) is not normal
and it is exact when f(u) is normal. It is then sufficient to correct the distribution
of F by multiplying by a correction factor obtained by taking the
ratio of the total density of $C_2$ when f(u) is not normal to the same total density
when f(u) is normal.

The induced density of $S^*$ on $C_1$ is obtained by transforming to generalized
latitude-longitude parameters¹ consisting of $(N-2)$ angles and a radius vector
$\rho$. This may be done so that the first $(k-2)$ angles are in $\Omega_2$ and the remaining
n angles are in $\Omega_1$. $\theta$ is the angle connecting $Z_1, \ldots, Z_{k-1}$
with $Y_1, \ldots, Y_n$. The induced density of $S^*$ is obtained by multiplying (29)
by $\rho^{N-2}$ and integrating with respect to $\rho$ over its complete range. In the special
case where $\theta = 0$, $\rho = (\sum_{i=1}^{k-1} z_i^2)^{1/2}$ and the induced density of $S^*$ depends only on $\rho$
and the $(k-2)$ angles in $\Omega_2$. To find the total density over $C_2$, we must multiply
the induced density of $S^*$ on $C_2$ by the element of area over $C_2$ and integrate over


¹ This transformation is given in [20].
the complete region. But this permits us to transform back to rectangular coordinates
$z_1, \ldots, z_{k-1}$, and in so doing we have a Jacobian which includes
the factor $\rho^{-(k-2)}$. Then, from the value of $\rho$ given above, the total density on
$C_2$ may be obtained by multiplying (29) by $(\sum_{i=1}^{k-1} z_i^2)^{n/2}$ and integrating with respect
to the variables $z_1, \ldots, z_{k-1}$ over their complete ranges. We may, upon
designating this total density by $D(S^* \mid \Omega_2)$, write


(31) $D(S^* \mid \Omega_2) = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} \Big(\sum_{i=1}^{k-1} z_i^2\Big)^{n/2} P'(S_1 \mid \Omega_2) \prod_{i=1}^{k-1} dz_i,$


where $P'(S_1 \mid \Omega_2)$ is the general expression for $P'(S_1)$ when $S_1$ is in $\Omega_2$. Since
$S_1$ is in $\Omega_2$, $P'(S_1 \mid \Omega_2)$ depends only on the variables $z_1, \ldots, z_{k-1}$. From (29)
and the inverse form of (28) it follows that


o k ( 
P’(S;|2,) = / Il E —(Ni + +++  Nya)*(N ++ Ns) ONG 85-5 
(32) an 
ome j 


k—1 \ Nj 
+ > NbN + e+ + Na) Ni + ++ + Nog) te + va) | dz, 


where, when j = 1, the first term in the linear function of z’s in f vanishes. When 
f(u) is normal, 


(33) $D(S^* \mid \Omega_2) = S_{k-2}/S_{N-2},$


where $S_{k-1} = 2\pi^{k/2}/\Gamma(k/2)$ and $S_{N-2}$ has the similar form. This evaluation is
easy when we remember that the transformation (28) is orthogonal.
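The normal-theory value in (33) can be checked numerically. The sketch below is illustrative only: it assumes N = k + n (as the count of angles in the text implies) and assumes that, under normality, the induced density $P'$ is the standard normal density in the N − 1 retained coordinates. The function names are hypothetical.

```python
import math

def sphere_area(d):
    """Surface content of the unit d-dimensional hypersphere, S_d = 2 pi^((d+1)/2) / Gamma((d+1)/2)."""
    return 2.0 * math.pi ** ((d + 1) / 2.0) / math.gamma((d + 1) / 2.0)

def normal_total_density(k, n):
    """Total density D for a standard normal density (assumption stated above).

    Integrates (sum z_i^2)^(n/2) * (2 pi)^(-(N-1)/2) * exp(-sum z_i^2 / 2) over the
    k - 1 z-coordinates in polar form:
    S_{k-2} * integral_0^inf rho^(N-2) exp(-rho^2 / 2) d rho / (2 pi)^((N-1)/2).
    """
    N = k + n
    radial = 2.0 ** ((N - 3) / 2.0) * math.gamma((N - 1) / 2.0)
    return sphere_area(k - 2) * radial / (2.0 * math.pi) ** ((N - 1) / 2.0)
```

With these definitions `sphere_area(0)` returns 2, matching the convention $S_0 = 2$, and `normal_total_density(k, n)` agrees with `sphere_area(k - 2) / sphere_area(k + n - 2)`.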

We may now state that, when the assumed population is not normal, the true 
probability of obtaining a value of F greater than a specified one differs from the 
usual tabular value by a factor approximating 


(34) $D(S^* \mid \Omega_2)\, S_{N-2}/S_{k-2}$


for large values of F.
In the special case in which k = 2, and since $S_0 = 2$ by definition, (34) reduces
to (21) as it should.


4. Approximate evaluation of the correction factors. In some cases it may be 
difficult to evaluate the integrals P”(A*), P”(A**), and D(S* | Q,). One method 
by which these integrals may be approximated is that of steepest descents’, 
which will now be demonstrated. P”(A*) will be considered in some detail and 
the results for P”(A**) and D(S*|Q,) are stated and have been obtained in 
the same way. 

In the form (22) for P”(A*), replace n by its value, N; + Ne — 2, and fix the 
ratio, Ni/Ne, by writing N. = NN,. If we set u = w,/(N; + N-)? andv = 

‘ The first discussion of this method was given by Laplace [21]. One may also consult 
the writings of R. H. Fowler [22]
w2/(Ni + N;)',


PP’ = (N, — nyeeron [ I un 


AIMN, 
‘ | a fo + »v) {i(- : = me} | du dv. 
When we define 


(36) g(u, v) = lone | o™ fu + v) {1(- —. ey | 


o ao 
(37) P”(A*) = (N, + N,) nn I u~ exp {Ni g(u, v)} du de. 


(35) 


Let us restrict f to be such that g has a true maximum within the ranges of
u and v and suppose g assumes that maximum at $(u_0, v_0)$. We approximate
g(u, v) by a three-term Taylor expansion about $(u_0, v_0)$ and $u^{-2}$ by a two-term
expansion about $u_0$. Then


(38) $g(u, v) = g(u_0, v_0) - \tfrac12 \Big\{ -(u - u_0)^2 \frac{\partial^2 g}{\partial u^2}\Big|_0 - 2(u - u_0)(v - v_0) \frac{\partial^2 g}{\partial u\,\partial v}\Big|_0 - (v - v_0)^2 \frac{\partial^2 g}{\partial v^2}\Big|_0 \Big\},$


and
(39) $u^{-2} = u_0^{-2}\{1 - 2(u - u_0)u_0^{-1}\}.$


The second term of the expansion of g(u, v) has vanished on the imposition of
maximizing conditions of the first order. If we take G as the matrix of the quadratic
form in $(u - u_0)$ and $(v - v_0)$ in (38), G is positive definite from the second-order
maximizing conditions.

Since G is positive definite, T exists such that $G = TT'$ and $|T| = |G|^{1/2}$.
We define new variables by the transformation


(40) $\begin{bmatrix} \xi \\ \eta \end{bmatrix} = T' \begin{bmatrix} \sqrt{N_1}\,(u - u_0) \\ \sqrt{N_1}\,(v - v_0) \end{bmatrix}.$

The ranges of $\xi$ and $\eta$ are taken to be from $-\infty$ to $\infty$, since the range
of $\sqrt{N_1}(u - u_0)$ is from $-\sqrt{N_1}\,u_0$ to $\infty$, and is taken as $-\infty$ to $\infty$ for $N_1$ large,
and since $\sqrt{N_1}(v - v_0)$ has that range already. The Jacobian of the transformation
is


(41) $N_1^{-1}|G|^{-1/2}.$


By substituting the approximations (38) and (39) in (37) and transforming
to the new variables defined in (40), it is possible to write


P'(A*) = ua? Ni(Ni +N)? |G [7 exp [Nig(uo, %)} 
(42) a ie + —1 2 + 9”) 
‘| l {1 — 2Ny*us"(at + bn)} exp {—3(E + 1)} dé dn, 


where $a\xi + b\eta$ is the linear function replacing $\sqrt{N_1}(u - u_0)$. (42) reduces at
once to


(43) P"(A*) = 2eN7'(Ny + N.)%!**? | G | up? exp{Nig(uo , v0)}. 


An expression similar to (43) may be written for P”(A**) when g(u, v) is 
replaced by g(—u, —v). By the same methods, 


k-1 


D(S*|2,) = (2)? N@-P? I | 0 (Qi +--+: + ri | ie|7 
i=l 
\ -* 2 


(k—1 
2 (Ar ove + HB A)OL + e+ HF Asar)0j"? 


j=l 
-exp {Ng(wi,--:, we)}; 
where 
i = N,/N, 


k—1 


g(wi, +++, We) = log. [I |: 7. Or tee) HANA Fees + dj41) W; . 
j=l 


\ j=l 


k—1 , dj 
“f< —d5* (1 + es + dj-1) Wj-1 + 2, Rees Wa + ws) | ’ 


a=} 


wt, ++: , we are the values for which g has a true maximum, and G is the matrix 
of the second-order partial derivatives of g, each multiplied by a factor (—1) 
and evaluated at the point where g takes its maximum value. G is positive def- 
inite due to second-order maximizing conditions. 

We have thus presented a method of approximating the correction factors 
for a large class of density functions, f(u), and for large values of the sample 
sizes. In particular examples it will be easy to see if the density function satisfies 
the mild conditions imposed. For small sample sizes, it is more often possible 
to evaluate the correction factor exactly.

A BAYES APPROACH TO A QUALITY CONTROL MODEL¹


By M. A. GIRSHICK AND HERMAN RUBIN²
Stanford University


Summary. This paper deals with a class of statistical quality control proce- 
dures and continuous inspection procedures which are optimum for a specified 
income function and a production model which can only be in one of four states, 
two of which are states of repair, with known transition probabilities. The 
Markov process, generated by the model and the class of decision procedures, 
approaches a limiting distribution and the integral equations from which the 
optimum procedures can be derived are given. 

1. Introduction. A machine which is producing items possessing a measurable
quality characteristic x can be in one of four states. In state i = 1, 2 the machine
is in production and is characterized by a probability density $f_i(x)$ of the quality
characteristic x. In state j = 3, 4 the machine is being repaired, having previously
been in state j − 2. The machine remains in the repair shop for $n_j$ time
units, where a time unit is taken as the length of time required to produce one
item. Repair puts the machine in state 1 which is assumed to be the desirable
state. When the machine is in state 1 there is a constant probability g that in
the next time unit it will go into state 2. This probability is inherent in the
production process and is assumed to be known. Once the machine enters state
2 it remains in this state until it is brought to repair (i.e., state 4). The machine
is brought from production to repair by a statistical quality control rule R
based on observations on x.

Two cases are considered. In case 1 it is assumed that 100% inspection of 
the items is based upon grounds other than inspection costs. In this case the rule 
R specifies only when to terminate production and put the machine in the repair 
shop. In case 2 inspection costs are taken into account or alternatively 100% 
inspection is precluded by the destructiveness of the tests so that the rule R, 
in addition to being a "stop" rule, also specifies which items in the production
sequence are to be inspected. In both cases, the aim is to maximize the long run 
average income. Case 1 will be discussed first. 

2. Optimum quality control rule for the case of 100% inspection. The economic
considerations involved in the production model in the case of 100%
inspection are (a) a function V(x) which gives the income per item of quality
x produced, and (b) two positive constants $c_j$, j = 3, 4, which represent the
cost of repair per unit of time the machine is in state j.

For the given production model, income function, and repair costs, the ex- 
pected income per unit of time the production process is in operation in any 
specified length of time depends on the particular quality control rule employed. 


¹ Presented to the Institute of Mathematical Statistics on December 27, 1950.
² Research done under the sponsorship of the Office of Naval Research.




For any rule R, let $I_N(R)$ stand for the sample average income per unit of time
if the production process has been in operation for N time units and initially
the machine is in state 1. Furthermore, let $\tau_{kN}$ (k = 1, 2, 3, 4) be the number
of time units in the N time units that the machine is in state k. Then


(2.01) $I_N(R) = \frac{1}{N}\sum_{\tau_{1N}} V(x) + \frac{1}{N}\sum_{\tau_{2N}} V(x) - \frac{\tau_{3N}}{N}\,c_3 - \frac{\tau_{4N}}{N}\,c_4.$


Let $E[V(x) \mid f_i]$ stand for the expected value of V(x) given that the machine
is in state i = 1, 2, and let $\pi_{kN} = E(\tau_{kN}/N)$. Then from (2.01)


(2.02) $E[I_N(R)] = \pi_{1N} E[V(x) \mid f_1] + \pi_{2N} E[V(x) \mid f_2] - \pi_{3N} c_3 - \pi_{4N} c_4.$


A rule R* will be called optimum or Bayes if it yields $\max_R \lim_{N\to\infty} E[I_N(R)]$.
That is, letting $I(R) = \lim_{N\to\infty} E[I_N(R)]$, then R* is defined by


(2.03) $I(R^*) = \max_R I(R).$


In spite of the apparent complexity of the problem, the characterization of R* 
turns out to be fairly simple. 

Let $x_1, x_2, \ldots$ be the quality of the items produced in sequence from the
time the machine comes out of the repair shop. Define


(2.04) $y_k = \frac{f_2(x_k)}{(1 - g) f_1(x_k)}, \quad Z_0 = 0, \quad Z_k = y_k(1 + Z_{k-1}).$

For any positive constant a let R(a) be the rule which states that inspection is
to continue as long as $Z_k < a$, and inspection is to terminate and the machine
is to be put in the repair shop as soon as for some k, $Z_k \ge a$. Furthermore, let
a* be such that


(2.05) $I(R(a^*)) = \max_a I(R(a)).$
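The rule R(a) is simple to compute from the recursion (2.04). The following Python sketch is illustrative only: the normal densities used in the usage note, the function names, and the convention that stopping occurs when $Z_k \ge a$ are assumptions made here, not specifications from the paper.

```python
import math

def normal_pdf(x, mean, sd=1.0):
    """Density of N(mean, sd^2); used only to build example f1 and f2."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def update_z(z_prev, x, f1, f2, g):
    """One step of (2.04): y = f2(x) / ((1 - g) f1(x)), Z = y (1 + Z_prev)."""
    y = f2(x) / ((1.0 - g) * f1(x))
    return y * (1.0 + z_prev)

def stopping_index(xs, f1, f2, g, a):
    """Return the first k (1-based) with Z_k >= a, or None if R(a) does not stop on xs."""
    z = 0.0  # Z_0 = 0 when the machine leaves the repair shop
    for k, x in enumerate(xs, start=1):
        z = update_z(z, x, f1, f2, g)
        if z >= a:
            return k
    return None
```

For example, with `f1 = lambda x: normal_pdf(x, 0.0)`, `f2 = lambda x: normal_pdf(x, 1.0)` and g = 0.1, an observation x = 0.5 has likelihood ratio 1, so each such observation maps Z to (1 + Z)/0.9.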


The optimum quality control rule R* is completely characterized by
THEOREM 1. R* = R(a*) if there exists a constant a such that


$E(V(x) \mid f_2) < I(R(a)).$


The following definitions and lemmas are required to prove this theorem. 

For any rule R in use, the time period during which the machine, having left
the repair shop, stays in production until it is placed back in the repair shop
and stays there for the specified length of time, will be called a cycle. Let n be
the number of time units the machine is in production under the rule R during
a cycle and let $m = n + n_j$ where j = 3 if the machine entered the repair shop
from state 1 and j = 4 if it entered the repair shop from state 2. Thus m is a
random variable and represents the length of the cycle. Let u = V(x) in any
time unit the machine is in production and $u = -c_j$ (j = 3, 4) in any time unit
the machine is in the repair shop. Let $J_m(R)$ be the total income per cycle. Then
$J_m(R) = \sum_{i=1}^{m} u_i$. A rule $R^\dagger$ will be called Bayes if it yields $\max_R E J_m(R)$. That
is, setting $J(R) = E J_m(R)$, $R^\dagger$ is defined by


(2.06) $J(R^\dagger) = \max_R J(R).$


Let R(a) be defined as above and let $\bar a$ be such that


(2.07) $J(R(\bar a)) = \max_a J(R(a)).$


LEMMA 1. $R^\dagger = R(\bar a)$ if $E(V(x) \mid f_2) < 0$.

PROOF. The proof of Lemma 1 follows directly from the general characterizations
of Bayes solutions given by Arrow, Blackwell, and Girshick [1] and
only a brief sketch of the argument will be presented here.

If at any time that the machine is in production it were known that it is in 
state 2, by the conditions of the lemma it would pay to place it in the repair 
shop. Thus the only relevant information obtainable from the observations is 
the a posteriori probability that the machine is in state 2. Whether or not for a 
given a posteriori probability the expected income per cycle is maximized by 
placing the machine in the repair shop depends on the existence or nonexistence 
of a continuation rule which from this stage on would guarantee an expected 
income exceeding the expected cost of repair. It is proved in the paper cited 
above that the set of a posteriori probabilities for which the best procedure is 
to take a given action is an interval. In the case under consideration, the set of 
a posteriori probabilities for which the best procedure is to put the machine in 
the repair shop is an interval from g* to 1, where g* is a nonnegative fraction 
and its value depends on V(x), $c_3$, $c_4$, and g. The optimum procedure $R^\dagger$ can
therefore be described as follows. At each stage of inspection compute the a
posteriori probability that the next item will be produced in state 2. As long as
this a posteriori probability is less than g* continue inspection. However, as
soon as this a posteriori probability equals or exceeds g*, terminate inspection
and place the machine in the repair shop. That this procedure is equivalent to
$R(\bar a)$ can be seen from the following.

At the kth stage of inspection, let $q_k$ be the a posteriori probability that item
k + 1 will be produced while the machine is in state 1. Then


(2.08) $q_k = \frac{(1 - g)\, q_{k-1} f_1(x_k)}{q_{k-1} f_1(x_k) + (1 - q_{k-1}) f_2(x_k)}, \quad q_0 = 1 - g.$


Let $y_k$ be defined as in (2.04). Then


(2.09) $\frac{1}{q_k} = \frac{1}{1 - g} + y_k\,\frac{1 - q_{k-1}}{q_{k-1}},$


(2.10) $\frac{1}{q_k} - \frac{1}{1 - g} = y_k\Big(\frac{1}{q_{k-1}} - 1\Big).$

Let


(2.11) $Z_k = \frac{1 - g}{g}\Big(\frac{1}{q_k} - \frac{1}{1 - g}\Big).$


Then from (2.10)
(2.12) $Z_k = y_k(1 + Z_{k-1}), \quad Z_0 = 0.$
Let now $p_k = 1 - q_k$. Then from (2.11),


(2.13) $p_k = \frac{g(Z_k + 1)}{g Z_k + 1}.$


Consequently the relationship $p_k \ge g^*$ is equivalent to the relationship


(2.14) $Z_k \ge \frac{g^* - g}{g(1 - g^*)} = \bar a.$

Thus, $R^\dagger$ is equivalent to the rule, continue inspection as long as $Z_k < \bar a$,
and terminate inspection as soon as $Z_k \ge \bar a$.
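The equivalence between the posterior-probability rule and the rule on $Z_k$ can be checked numerically. The sketch below runs the posterior recursion of (2.08) alongside the recursion of (2.04) and compares $p_k = 1 - q_k$ with the expression $g(Z_k + 1)/(gZ_k + 1)$ of (2.13). The function names and the example densities are assumptions for illustration.

```python
import math

def normal_pdf(x, mean, sd=1.0):
    """Density of N(mean, sd^2); used only to build example f1 and f2."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def posterior_and_z(xs, f1, f2, g):
    """Return the list of pairs (p_k, Z_k) along the observed sequence xs."""
    q, z = 1.0 - g, 0.0  # q_0 = 1 - g, Z_0 = 0
    out = []
    for x in xs:
        l1, l2 = f1(x), f2(x)
        q = (1.0 - g) * q * l1 / (q * l1 + (1.0 - q) * l2)  # posterior update, (2.08)
        z = (l2 / ((1.0 - g) * l1)) * (1.0 + z)              # recursion (2.04)
        out.append((1.0 - q, z))
    return out

def p_from_z(z, g):
    """Posterior probability of state 2 expressed through Z, as in (2.13)."""
    return g * (z + 1.0) / (g * z + 1.0)

def threshold_from_gstar(g_star, g):
    """Threshold on Z corresponding to a posterior threshold g*, as in (2.14)."""
    return (g_star - g) / (g * (1.0 - g_star))
```

With g = 0.1 and g* = 0.5 the threshold on Z is 8, and `p_from_z(8.0, 0.1)` returns 0.5, as the equivalence requires.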
LEMMA 2. For any positive constant a


(2.15) $P(m > m_0 \mid R(a)) \to 0$ as $m_0 \to \infty$,


where $m = n + n_j$ is the length of a cycle.

PROOF. It suffices to show that the lemma holds for n, i.e., that R(a) terminates
production with probability 1. It is clear that $P(y > 1 \mid f_2) = r > 0$. Let $n_0$
be a large positive integer and let $i < n_0$ be any integer such that $n_0 - i > [a + 1] = d$,
where the symbol $[\zeta]$ stands for the smallest integer greater than
or equal to $\zeta$. Furthermore let P(i, 2) stand for the probability that the machine
is in state 2 after i time units of production and let $P(i, n_0 \mid 2)$ be the probability
that there has been a run of at least d y's each greater than 1 between the time
period i and the time period $n_0$ given that the machine is in state 2. Then


(2.16) $P(n < n_0 \mid R(a)) \ge P(i, 2)\, P(i, n_0 \mid 2) \ge (1 - (1 - g)^i)(1 - (1 - r^d)^k),$
where $k = [(n_0 - i)/d]$. Setting $i = \alpha n_0$, equation (2.16) becomes

(2.17) $P(n < n_0 \mid R(a)) \ge (1 - \beta_1^{n_0})(1 - \beta_2^{n_0}) \ge 1 - A\beta^{n_0},$

where $0 < \beta < 1$. Thus

(2.18) $P(n > n_0) \le A\beta^{n_0},$


which proves the lemma.

LEMMA 3. For any positive constant a, $E[m \mid R(a)] < \infty$, where m is the length
of a cycle.

PROOF. Again it suffices to show that $E[n \mid R(a)] < \infty$. In view of (2.18)
the series $\sum_{k=0}^{\infty} P(n > k)$ is convergent. But


(2.19) $\sum_{k=0}^{\infty} P(n > k) = \sum_{k=0}^{\infty} \sum_{j=k+1}^{\infty} P(n = j) = \sum_{j=0}^{\infty} j\,P(n = j),$


which proves the lemma.

LEMMA 4. For any rule R with $E(m \mid R) < \infty$, $I(R) = E[\sum_{i=1}^{m} u_i]/E(m \mid R)$,
where $I(R) = \lim_{N\to\infty} E I_N(R)$.

PROOF. Let $W_N$ be the number of completed cycles in the time period of length
N and $L_N$ be their total length. A repeated application of the Strong Law of
Large Numbers shows that with probability 1 the following sequences approach
the indicated limits as $N \to \infty$:


N _ Ly Ly , 
—__—— —>0); ) ae op len) - 
W. ; (b) Vv. E(m); 


(c) 7 — E(m); (d) | > U; wx |—0, 
N i= 1 


LN 


(a) 


(2.20) 


(e) EB | ra(R) _ _ u;/(Wwy B(m)) | — 0; 
t=1 


(f) E [> ta/(W x EX) | +E | U; B(mn) |. 
i=] 


t=1 


Therefore,


(2.21) $E I_N(R) - E\Big[\sum_{i=1}^{m} u_i\Big]\Big/E(m) \to 0$


as $N \to \infty$, which completes the proof.
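Lemma 4 suggests a direct way to estimate I(R(a)) by simulation: average the income over many independent cycles and divide by the average cycle length. The sketch below is a Monte Carlo illustration under assumptions that are not in the paper: the two production densities are taken as N(mean1, 1) and N(mean2, 1), the move from state 1 to state 2 is drawn before each item is produced, and all names and parameter values are hypothetical.

```python
import math
import random

def simulate_cycle(rng, a, g, mean1, mean2, V, c3, c4, n3, n4):
    """Simulate one cycle under R(a); return (total income, cycle length)."""
    state, z, income, n = 1, 0.0, 0.0, 0
    while True:
        # in state 1 the machine goes to state 2 with probability g per time unit
        if state == 1 and rng.random() < g:
            state = 2
        x = rng.gauss(mean1 if state == 1 else mean2, 1.0)
        income += V(x)
        n += 1
        # likelihood ratio f2(x)/f1(x) for unit-variance normal densities
        lr = math.exp((mean2 - mean1) * x - 0.5 * (mean2 ** 2 - mean1 ** 2))
        z = lr / (1.0 - g) * (1.0 + z)  # recursion (2.04)
        if z >= a:
            break
    # repair lasts n3 (from state 1) or n4 (from state 2) units at cost c3 or c4 per unit
    if state == 1:
        return income - c3 * n3, n + n3
    return income - c4 * n4, n + n4

def long_run_income(a, cycles=2000, seed=0, **model):
    """Estimate I(R(a)) as (sum of cycle incomes) / (sum of cycle lengths)."""
    rng = random.Random(seed)
    total_income = 0.0
    total_length = 0.0
    for _ in range(cycles):
        inc, length = simulate_cycle(rng, a, **model)
        total_income += inc
        total_length += length
    return total_income / total_length
```

Evaluating `long_run_income` over a grid of values of a gives a rough numerical stand-in for the maximization in (2.05); the integral equations of the next section address the same quantity analytically.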

LEMMA 5. If for any rule R, $E(m \mid R) = \infty$, then $I(R) = E(V(x) \mid f_2)$.

PROOF. To prove this lemma it will suffice to show that the proportion of
time units in which the machine is in state 2 in N time units approaches 1 as
$N \to \infty$. If as $N \to \infty$ there are only a finite number of cycles, then with probability
1 the machine will enter state 2 and remain in state 2 so that the lemma
is established. Assume that this is not the case.

Let $s_i$ (i = 1, 2, ...) be the number of time units required for the machine
to enter state 2 in the ith cycle if no stop rule were employed plus the number
of time units it stays in the repair shop. Then $s_1, s_2, \ldots$ are identically and
independently distributed variates with $E s_i \le (1/g) + \max(n_3, n_4)$. Let $t_i$ (i =
1, 2, ...) be the length of the ith cycle. Then $t_1, t_2, \ldots$ are identically and independently
distributed variates with $t_i$ independent of $s_1, s_2, \ldots, s_{i-1}, s_{i+1}, \ldots$. Define


(2.22) $z_i = t_i - s_i$ if $t_i > s_i$,
       $z_i = 0$ if $t_i \le s_i$.


Then $z_i$ = number of time units in the ith cycle that the machine is in state 2.


The fraction of time units that the machine is not in state 2 in the first n cycles
is given by


(2.23) $Q_n = \frac{w_1 + w_2 + \cdots + w_n}{t_1 + t_2 + \cdots + t_n},$

where

(2.24) $w_i = t_i$ if $t_i \le s_i$,
       $w_i = s_i$ if $t_i > s_i$.


Combining (2.22) and (2.24) yields
(2.25) $z_i + w_i = t_i.$


Now by (2.24) $E w_i \le E s_i \le (1/g) + \max(n_3, n_4)$, and since by assumption
$E z_i = \infty$, it follows from (2.25) that $E t_i = \infty$. By (2.23) and (2.25),


(2.26) $Q_n = \frac{w_1 + w_2 + \cdots + w_n}{w_1 + \cdots + w_n + z_1 + \cdots + z_n} = \frac{(w_1 + \cdots + w_n)/n}{(w_1 + \cdots + w_n)/n + (z_1 + \cdots + z_n)/n},$


so that by the Strong Law of Large Numbers, $\lim_{n\to\infty} Q_n = 0$ with probability 1.
Let


(2.27) $\sum_{i=1}^{n} (w_i + z_i) \le N < \sum_{i=1}^{n+1} (w_i + z_i).$


Then the relative length of time that the machine is not in state 2 in N time
units is given by


(2.28) $\frac{1}{N}\Big\{\sum_{i=1}^{n} w_i + \text{length of time that it is not in state 2 between } \sum_{i=1}^{n}(w_i + z_i) \text{ and } N\Big\} \le \frac{\sum_{i=1}^{n+1} w_i}{\sum_{i=1}^{n} (w_i + z_i)} \to 0$


with probability 1. This proves the lemma.

The above lemmas will now be used to prove Theorem 1. 

By Lemma 5 the only rules R that need to be considered are those for which
$E(m \mid R) < \infty$. Let


(2.29) $K(R) = E\Big[\sum_{i=1}^{m} \big(u_i - I(R(a^*))\big)\Big].$


The income function $u - I(R(a^*))$ satisfies the conditions of Lemma 1. Hence
there exists a constant $\bar a$ such that


(2.30) $K(R(\bar a)) = \max_a K(R(a)).$
Now by Lemma 4,

(2.31) $K(R) = E(m \mid R)\{I(R) - I(R(a^*))\}.$

Since by Lemma 3 $E(m \mid R(\bar a)) < \infty$, then for all R

(2.32) $K(R) \le K(R(\bar a)) = E(m \mid R(\bar a))\{I(R(\bar a)) - I(R(a^*))\}.$


But by definition $I(R(\bar a)) \le I(R(a^*))$. It follows therefore that $K(R) \le 0$
so that by (2.31) $I(R) - I(R(a^*)) \le 0$, which proves the theorem.

By Lemma 3, a sequential procedure defined by the rule R(a) terminates with 
probability 1. It is of interest to investigate under what conditions R(a) is a 
truncated sequential procedure. The answer to this is given by 

THEOREM 2. Let $r_0$ be the least upper bound of numbers r such that $P(y < r \mid f_1) = 0$.
A necessary and sufficient condition that R(a) be a nontruncated sequential
procedure is that $r_0 < a(1 - r_0)$.

PROOF. Since $y = f_2(x)/((1 - g)f_1(x)) \ge 0$, it follows that $r_0 \ge 0$, so that if
the condition of the theorem is satisfied $r_0$ must be less than 1. Assume that
$r_0 < a(1 - r_0)$. Let $r_1 > r_0$ but still satisfying the condition $r_1 < a(1 - r_1)$.
Then $P(y < r_1 \mid f_1) > 0$. Thus for any finite n whatever, there is a positive probability
that (a) the machine is in state 1 during the n time units and (b) there
exists a sequence $y_1, \ldots, y_n$ such that $y_i < r_1$ for all i. But for such a sequence
of y's inspection cannot terminate since $Z_j < \sum_{i=1}^{j} r_1^i < a$ for j = 1, ..., n.
Conversely if $r_0 > a(1 - r_0)$, the series $\sum_{i=1}^{\infty} r_0^i > a$. Thus there exists an $n_0$ for
which $\sum_{i=1}^{n_0} r_0^i > a$, which implies that $P(n > n_0) = 0$. This completes the proof
of Theorem 2.
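The condition of Theorem 2 and the truncation point used in the converse part of the proof are easy to compute. The sketch below is an illustration; the function names and the cap `max_terms` are assumptions introduced here.

```python
def is_truncated(r0, a):
    """Theorem 2: R(a) is nontruncated exactly when r0 < a (1 - r0)."""
    return not (r0 < a * (1.0 - r0))

def truncation_point(r0, a, max_terms=10_000):
    """Smallest n0 with r0 + r0^2 + ... + r0^n0 > a, or None if none is found."""
    total, term = 0.0, 1.0
    for n0 in range(1, max_terms + 1):
        term *= r0      # term is now r0 ** n0
        total += term   # partial sum of the geometric series
        if total > a:
            return n0
    return None
```

For instance, with $r_0 = 0.9$ and a = 2 the partial sums are 0.9, 1.71, 2.439, so production must stop by the third item; with $r_0 = 0.5$ and a = 2 the series never exceeds 1 and the procedure is nontruncated.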


3. Integral equations for the Markov process in the case of 100% inspection.
In the previous section it was shown that the optimum quality control rule is
given by R(a) for a = a*. The problem is to find a* for a given V(x), $c_3$ and $c_4$.
Since a* is that value of a for which $I(R(a)) = \lim_{N\to\infty} E I_N(R(a))$ is a maximum,
this problem will be solvable if I(R(a)) is determined for an arbitrary a.
But by (2.02) this is equivalent to finding for any a $\lim_{N\to\infty} \pi_{kN}$ (k = 1, 2, 3, 4).
The solution to the latter problem will be given in this section.

The production model under consideration together with the stop rule R(a)
creates a Markov process with states and transitions which can be represented
schematically as follows:

(1, $Z_k$): If $Z_k < a$, take another observation; if $Z_k \ge a$, go into [3, 1].

(2, $Z_k$): If $Z_k < a$, take another observation; if $Z_k \ge a$, go into [4, 1].

[3, k] → [3, k + 1] (k = 1, ..., $n_3 - 1$).

[3, $n_3$]: Take an observation.

[4, k] → [4, k + 1] (k = 1, ..., $n_4 - 1$).

[4, $n_4$]: Take an observation.

Here the symbol (i, E), i = 1, 2, stands for the joint event, the machine is in
state i and the event E occurred. The symbol [j, k], (j = 3, 4; k = 1, ..., $n_j$),
stands for the event, the machine is in the kth time unit of repair and prior to
repair has been in state j − 2. The arrows indicate transitions from one state
to another.

From the results of a paper by Erdös, Feller, and Pollard [2], it follows that
the probability that at time m the machine is either in state [3, $n_3$] or [4, $n_4$]
approaches a limit, and hence the probability of the event (i, $Z_k \ge a$) or the
event (i, $Z_k \in S$) where S is any Borel set, approaches a limit.

While in the Markov process under consideration it was assumed that when
the machine leaves the repair shop $Z_0 = 0$ and $p_0 = g$, where $p_k$ (see (2.08))
is the a posteriori probability that item k + 1 will be produced in state 2, it is
found just as convenient to derive the integral equations for the limiting distribution
for any arbitrary value for $Z_0$ and a corresponding a posteriori probability
h.

Write Z for $Z_k$, w for $Z_{k-1}$, and y for $y_k$. Then Z = y(w + 1). For any x > 0
let P(i, Z < x) stand for the probability of the event (i, Z < x).

LEMMA 6. If $P(y = \infty \mid f_2) = 0$, then


(3.01) $P(2, Z < x) = \frac{g}{1 - g}\int_0^x (1 + z)\, dP(1, Z < z).$


PROOF. The truth of this lemma can readily be seen from the fact that if $\zeta$
is written for $Z_k$ and p for $p_k$ then


$\frac{p}{1 - p} = \frac{g}{1 - g}\,(1 + \zeta),$


as can be verified from equation (2.13).

In what follows, it will be assumed that $P(y = \infty \mid f_2) = 0$. This will reduce
the problem to that of finding an integral equation only for P(1, Z < x) as will
be evident from the following equations:


(3.02) $P(1, Z < x) = (1 - g)\,P(1, w < a, y(w + 1) < x) + (1 - g)(1 - h)(\mu + \nu)\,P(y(Z_0 + 1) < x \mid f_1),$
(3.03) $P[3, k] = \mu = P(1, Z < \infty) - P(1, Z < a), \quad (k = 1, \ldots, n_3),$
(3.04) $P[4, k] = \nu = P(2, Z < \infty) - P(2, Z < a), \quad (k = 1, \ldots, n_4),$
(3.05) $P(1, Z < \infty) + P(2, Z < \infty) + n_3\mu + n_4\nu = 1.$


Let $G_1(y)$ be the cumulative distribution of y given state 1. Then from (3.02)


(3.06) $P(1, Z < x) = (1 - g)\int_0^a G_1\Big(\frac{x}{1 + w}\Big)\, dP(1, Z < w) + (1 - g)(1 - h)(\mu + \nu)\,G_1\Big(\frac{x}{1 + Z_0}\Big).$

Interchanging order of integration yields


(3.07) $P(1, Z < x) = (1 - g)\Big[\int_0^{x/(1+a)} P(1, w < a)\, dG_1(y) + \int_{x/(1+a)}^{x} P(1, w < (x/y) - 1)\, dG_1(y)\Big] + (1 - g)(1 - h)(\mu + \nu)\,G_1\Big(\frac{x}{1 + Z_0}\Big)$
       $= (1 - g)\Big[P(1, w < a)\,G_1\Big(\frac{x}{1 + a}\Big) + (1 - h)(\mu + \nu)\,G_1\Big(\frac{x}{1 + Z_0}\Big)\Big] + (1 - g)\int_{x/(1+a)}^{x} P(1, w < (x/y) - 1)\, dG_1(y).$


In terms of the quantities defined in (3.01) to (3.05) $\lim_{N\to\infty} \pi_{kN}$ with h = g
is given by


(3.08) $\lim_{N\to\infty} \pi_{1N} = P(1, Z < a) + (1 - g)(\mu + \nu),$

(3.09) $\lim_{N\to\infty} \pi_{2N} = P(2, Z < a) + g(\mu + \nu),$

(3.10) $\lim_{N\to\infty} \pi_{3N} = n_3\mu,$

(3.11) $\lim_{N\to\infty} \pi_{4N} = n_4\nu.$


The computation of the quantities involved in (3.08) to (3.11) can be simplified
by the following device.

Let $\rho = \mu + \nu$. Assign a value to $\rho$, say $\rho'$. Solve for P(1, Z < x) from either
(3.06) or (3.07). Call the solution P'(1, Z < x). Compute P'(2, Z < x) from
(3.01). Compute $\mu'$ from (3.03) and $\nu'$ from (3.04). Compute
(3.12) $D = P'(1, Z < \infty) + P'(2, Z < \infty) + n_3\mu' + n_4\nu'.$

Then
(3.13) $P(1, Z < x) = P'(1, Z < x)/D,$
(3.14) $P(2, Z < x) = P'(2, Z < x)/D,$
(3.15) $\mu = \mu'/D, \quad \nu = \nu'/D.$


4. Optimum quality control rule when inspection costs are considered or
when tests are destructive. As was previously pointed out, in case inspection
costs are taken into consideration, or when inspection is destructive, a quality
control rule R must not only specify when to stop inspection and put the machine
in the repair shop, but it must also specify which items are to be inspected.
That is, R must be a continuous inspection plan as well as a stop rule.

The income considerations involved in the present situation are somewhat 
different from those in case 1. To begin with, the cost of inspecting an item has 
to be specified. In addition, the income from an item may depend not only on 
its quality but also on whether or not it has been inspected. This is obvious 
if inspection destroys the item. But even if the tests are not destructive, throw- 
ing away or repairing a defective item, for example, may involve a different 
cost consideration from that of selling a possibly defective item with a resulting
loss of good will, etc.

To distinguish between the two types of income functions, let $V_0(x)$ be the
income of an uninspected item of quality x and V(x) be the income of an inspected
item of quality x. It may be assumed that inspection costs have already
been reflected in V(x). In addition let $c_j$ (j = 3, 4) be, as above, the cost of repair
when the machine is in state j. Again as above let $I_N(R)$ represent the
average income per unit of time if the production process has been in operation
for N time units and the rule R is employed. Then


(4.01) $E I_N(R) = \pi_{01N} E[V_0(x) \mid f_1] + \pi_{02N} E[V_0(x) \mid f_2] + \pi_{1N} E[V(x) \mid f_1] + \pi_{2N} E[V(x) \mid f_2] - \pi_{3N} c_3 - \pi_{4N} c_4,$


where in the N time units $\pi_{0iN}$ (i = 1, 2) is the expected proportion of time
units in which items are not inspected and the machine is in state i, $\pi_{iN}$ (i =
1, 2) is the expected proportion of time units in which items are inspected and
the machine is in state i and $\pi_{jN}$ (j = 3, 4) is the expected proportion of time
units the machine is in state j (i.e., repair).

A rule R* will be called Bayes if it yields $\max_R \lim_{N\to\infty} E I_N(R)$.

Without going through the details of the argument, which are similar to the
case previously considered, the optimum rule R* is characterized as follows:
Let


(4.02) $y_n = \frac{f_2(x_n)}{(1 - g) f_1(x_n)}$


if in the nth stage of production the nth item is inspected, and let
(4.03) $y_n = \frac{1}{1 - g}$


if in the nth stage of production the nth item is not inspected. Let
(4.04) $Z_n = y_n(1 + Z_{n-1}), \quad Z_0 = 0.$


Assume that when the machine leaves the repair shop the first item is not inspected.
Then for suitably chosen positive constants a* and b* with b* < a*,
R* is the rule which states that items are not inspected as long as $Z_n < b^*$.
Inspection begins as soon as $Z_n \ge b^*$, and inspection continues until either $Z_n < b^*$
or $Z_n \ge a^*$. In the former case production continues but inspection terminates,
in the latter case inspection terminates and the machine is put in the repair
shop.

It is to be noted that whenever for some $n_0$, $Z_{n_0} < b^*$, the number of items to
be skipped is completely determined. For if k is the number of items to be
skipped, then k must satisfy the equation


(4.05) $Z_{n_0+k} = \sum_{j=1}^{k} \Big(\frac{1}{1 - g}\Big)^{j} + \frac{Z_{n_0}}{(1 - g)^k} \ge b^*.$


Summing the above equation and solving for k yields


(4.06) $k = \Big[\log\Big(\frac{g b^* + 1}{g Z_{n_0} + 1}\Big) \Big/ \big(-\log(1 - g)\big)\Big],$


where the symbol $[\zeta]$ stands for the smallest integer greater than or equal to $\zeta$.
The interesting fact is that R* prescribes that inspection or noninspection shall
occur in batches of items.
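The batch length can be computed either in closed form or by stepping the recursion. The sketch below rests on an assumption made here: for an uninspected item the multiplier is $y_n = 1/(1 - g)$, so that $Z_n = (1 + Z_{n-1})/(1 - g)$, and the closed form is the one obtained by summing the resulting geometric series. The function names are hypothetical.

```python
import math

def items_to_skip(z0, b, g):
    """Closed form: smallest k with Z_{n0+k} >= b when no items are inspected (requires z0 < b)."""
    ratio = (g * b + 1.0) / (g * z0 + 1.0)
    return math.ceil(math.log(ratio) / -math.log(1.0 - g))

def items_to_skip_by_iteration(z0, b, g):
    """Same quantity obtained by stepping Z -> (1 + Z) / (1 - g) until Z >= b."""
    z, k = z0, 0
    while z < b:
        z = (1.0 + z) / (1.0 - g)
        k += 1
    return k
```

The two functions should agree away from floating-point ties; for $Z_{n_0} = 0$, b = 5 and g = 0.1 both give four uninspected items, since the successive values of Z are about 1.11, 2.35, 3.72 and 5.24.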


5. Integral equations for the Markov process in case inspection costs are 
considered. The integral equations for the limiting distribution of the present 
Markov process are understandably more complicated. They are obtained as 
follows: 

As in the previous case, let $Z_0$ be arbitrary and let h be the corresponding a
posteriori probability. Let Z = y(1 + w), where y is defined by (4.02) and (4.03).
For any arbitrary a and b (a* and b* are obtainable by a maximization process),
let the symbols P(1, Z < x), P(2, Z < x), and P[j, k] have the same meaning
as in the previous case. Then


(5.01) $P(2, Z < x) = \frac{g}{1 - g}\int_0^x (1 + z)\, dP(1, Z < z),$


P(1,Z < zx) = (1 — g) ‘P(,, w< b, < r) 
\ =e 


‘ ) 
+PHi,b<w<ayw+ I<2)4+(1—hA)(u +) (4 2 a e), 


l-g 
(5.03) $P[3, k] = \mu = P(1, Z < \infty) - P(1, Z < a), \quad (k = 1, \ldots, n_3),$

(5.04) $P[4, k] = \nu = P(2, Z < \infty) - P(2, Z < a), \quad (k = 1, \ldots, n_4),$
(5.05) $P(1, Z < \infty) + P(2, Z < \infty) + n_3\mu + n_4\nu = 1.$


Let $G_1(y)$ be the cumulative distribution of y for an inspected item given that
the machine is in state 1. Then the integral equation for P(1, Z < x) is given by


PO, Z <2) = (1 — g) min [P(1, w < b) P(l, w < (1 — g)x — 1)) 


+ (1 — Au +o) (« - “a+ t)), 


where 7 = 1| if argument is positive, » = 0 otherwise. 
Equation (5.06) can also be written as 


rc 


PU,Z<a2)=(1- g)| min (P(1, w <b), PU, w < (1 — gx — 1)) 


tee raz os ie ‘) | 


‘—¢ 
(5.07) 


+ (1 — g) [a (-,) (P(1, w < a) — P(l, w < b)) 
l+a 


z/(1+b) (¢ \ 
+ / iP (1, e<ia~ 1) — Pil,w< w)} ae) |. 
z/(1+a) \ ¥y 


The limiting probabilities involved in (4.01) for this process are


(5.08) $\lim_{N\to\infty} \pi_{01N} = P(1, Z < b) + (1 - g)(\mu + \nu),$

(5.09) $\lim_{N\to\infty} \pi_{02N} = P(2, Z < b) + g(\mu + \nu),$

(5.10) $\lim_{N\to\infty} \pi_{1N} = P(1, Z < a) - P(1, Z < b),$

(5.11) $\lim_{N\to\infty} \pi_{2N} = P(2, Z < a) - P(2, Z < b),$

(5.12) $\lim_{N\to\infty} \pi_{3N} = n_3\mu, \quad \lim_{N\to\infty} \pi_{4N} = n_4\nu.$

The previous remarks about computing the integral equation apply to this
case also.

TESTING A STRAGGLER MEAN IN A TWO-WAY
CLASSIFICATION USING THE RANGE


By JACK MOSHMAN


Oak Ridge National Laboratory


1. Summary and introduction. The use of the range in place of the standard 
deviation as a measure of dispersion has long been recognized as a convenient 
and easily calculated statistic. Possibly its most notable employment has been 
in the industrial statistician’s quality control charts. A statistic based upon the 
range is described below which may be used to test whether one of a group of 
means may be considered to be a straggler from all or some of the others in a 
two-way analysis of variance. 

2. Previous literature. Nair [1] derived the distribution of $u_k = (x_k - \bar x)/s_\nu$
and $u_k' = (\bar x - x_1)/s_\nu$, where $x_1 \le x_2 \le \cdots \le x_k$, $\bar x = \sum_{i=1}^{k} x_i/k$ and $s_\nu$ is an
independent estimate of the standard deviation based on $\nu$ degrees of freedom
from a normal population.

If $x_1$ and $x_k$ are considered row or column means in a two-way analysis of
variance, $\bar x$ the grand mean, and $s_\nu$ the error root-mean-square estimate of the
standard deviation, $u_k$ or $u_k'$ may be adapted to examine an individual sample
mean as a possible straggler from the grand mean. In this case $\nu = (r - 1)(k - 1)$
if there exist r rows and k columns and the statistic takes the form
$u_k = (x_k - \bar x)\sqrt{r}/s_\nu$, where $x_k$ is the largest of the k column means. Nair provided
critical values of $u_k$ and $u_k'$ for k = 3(1)9 and $\nu$ = 10(1)20, 24, 30, 40,
60, 120 and $\infty$.

Tukey [2] suggested an empirical approximation to Nair's procedure which does not involve the use of any special tables. He showed that

$$\frac{u_k - \frac{6}{5}\log_{10} k}{3\left(\frac{1}{4} + \frac{1}{\nu}\right)}, \qquad k > 3,$$

may be treated as a normal deviate and will test the most deviant mean from the grand mean as a straggler.

3. The g-statistic. Hartley [3] recently prepared, as an alternative to the F 
ratio, a test based solely on the range to examine row and column homogeneity 
in a two-way classification. It is proposed here to develop a statistic, also based 
on the range, to examine one of a group of means as a possible straggler from
the grand mean.

Consider Nair's statistic applied to column means:

$$u_k = \frac{(\bar{x}_{.k} - \bar{x})\sqrt{r}}{s_\nu}.$$


It is well known that $s_\nu$ may be approximated by the range. Let us consider the usual probability model

$$x_{ij} = \alpha_i + \beta_j + \epsilon_{ij},$$

where $\alpha_i$ and $\beta_j$ are the effects of the ith row and jth column respectively, and the $\epsilon_{ij}$ are error variates distributed as $N(0, \sigma^2)$. Then

$$\bar{x}_{i.} = \alpha_i + \bar{\beta} + \bar{\epsilon}_{i.},$$

whence

$$x_{ij} - \bar{x}_{i.} = (\beta_j - \bar{\beta}) + (\epsilon_{ij} - \bar{\epsilon}_{i.}),$$

which we shall call row residuals. In the jth column, the range of the row residuals is equal to the range of the r independent normal deviates $(\epsilon_{ij} - \bar{\epsilon}_{i.})$, distributed as $N(0, (k - 1)\sigma^2/k)$. We may define $\bar{w}_{k,r}$ as the mean of the k ranges (k columns) with r observations in each. Patnaik [4] showed that the distribution of $\bar{w}_{k,r}/(\sigma c_{k,r})$ may be approximated by that of $\chi/\sqrt{\nu'}$, where $c_{k,r}$ is an appropriate scale factor and $\nu'$ is the "equivalent number of degrees of freedom."

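This is done by equating the first two moments of $\bar{w}_{k,r}/(\sigma c_{k,r})$ to those of $\chi/\sqrt{\nu'}$:

(1) $$\sqrt{\frac{2}{\nu'}}\,\frac{\Gamma\left(\frac{\nu'+1}{2}\right)}{\Gamma\left(\frac{\nu'}{2}\right)} = \sqrt{\frac{k-1}{k}}\,\frac{d_r}{c_{k,r}},$$

(2) $$1 - \frac{2}{\nu'}\left[\frac{\Gamma\left(\frac{\nu'+1}{2}\right)}{\Gamma\left(\frac{\nu'}{2}\right)}\right]^2 = \frac{V_r (k - 1)\left[1 + (k - 1)\rho_{k,r}\right]}{k^2 c_{k,r}^2},$$

where $d_r$ and $V_r$ are the population mean and variance respectively of the range in samples of r from a normal population with unit variance and $\rho_{k,r}$ is the correlation between any two column ranges, derived from the row residuals. Hartley ([3], p. 276) provides a table of $\rho_{k,r}$, and $V_r$ is tabulated in Pearson [5]. But since $s_{\nu'}/\sigma$ is distributed as $\chi/\sqrt{\nu'}$ we may let $s_{\nu'} = \bar{w}_{k,r}/c_{k,r}$ after determining $\nu'$ and $c_{k,r}$ from the solution of equations (1) and (2).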
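The solution of equations (1) and (2) can be sketched numerically. The following is a minimal sketch, not the author's procedure: it reads (1) as matching the mean of $\chi/\sqrt{\nu'}$ to $\sqrt{(k-1)/k}\,d_r/c_{k,r}$ and (2) as matching its variance to $V_r(k-1)[1+(k-1)\rho_{k,r}]/(k^2 c_{k,r}^2)$. Dividing (2) by the square of (1) eliminates $c_{k,r}$, so $\nu'$ can be found by bisection. The function names are hypothetical, the constants $d_6 \approx 2.534$ and $V_6 \approx 0.7192$ are the usual approximate range constants for samples of six, and the value of $\rho$ in the usage lines is an illustrative assumption, not an entry from Hartley's table.

```python
import math

def chi_ratio_mean(nu):
    """E[chi / sqrt(nu)] for a chi variate with nu degrees of freedom."""
    return math.sqrt(2.0 / nu) * math.exp(
        math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2))

def solve_patnaik(k, d_r, V_r, rho):
    """Return (nu', c_{k,r}) matching the first two moments.

    Dividing equation (2) by the square of equation (1) gives
    (1 - m^2) / m^2 = V_r [1 + (k-1) rho] / (k d_r^2), m = E[chi/sqrt(nu')],
    whose left side decreases in nu', so bisection applies.
    """
    target = V_r * (1 + (k - 1) * rho) / (k * d_r ** 2)

    def excess(nu):
        m = chi_ratio_mean(nu)
        return (1 - m * m) / (m * m) - target

    lo, hi = 0.5, 1e6
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if excess(mid) > 0:
            lo = mid   # variance ratio still too large: more degrees of freedom
        else:
            hi = mid
    nu = 0.5 * (lo + hi)
    c = math.sqrt((k - 1) / k) * d_r / chi_ratio_mean(nu)   # from equation (1)
    return nu, c

# Usage: k = 5 columns of r = 6 observations; rho = 0.06 is a made-up value.
nu_prime, c_kr = solve_patnaik(5, 2.534, 0.7192, 0.06)
print(nu_prime, c_kr)
```

With a small positive correlation the equivalent degrees of freedom come out somewhat below what k independent ranges would give, which is the behaviour the moment equations are meant to capture.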

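From the theory of the analysis of variance, it is known that $(\epsilon_{ij} - \bar{\epsilon}_{i.} - \bar{\epsilon}_{.j} + \bar{\epsilon})$, $(\bar{\epsilon}_{i.} - \bar{\epsilon})$ and $(\bar{\epsilon}_{.j} - \bar{\epsilon})$ are independently distributed ([6], p. 344). Since $\bar{w}_{k,r}$ is a function of $(\epsilon_{ij} - \bar{\epsilon}_{i.} - \bar{\epsilon}_{.j} + \bar{\epsilon})$ only and $(\bar{x}_{i.} - \bar{x})$ and $(\bar{x}_{.j} - \bar{x})$ are functions only of $(\bar{\epsilon}_{i.} - \bar{\epsilon})$ and $(\bar{\epsilon}_{.j} - \bar{\epsilon})$ respectively, it follows that $\bar{w}_{k,r}$, $(\bar{x}_{i.} - \bar{x})$ and $(\bar{x}_{.j} - \bar{x})$ are also independently distributed.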

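Returning to Nair's statistic we have, after substituting $\bar{w}_{k,r}/c_{k,r}$ for $s_\nu$,

$$u_k = \frac{|\bar{x}_{.k} - \bar{x}|\sqrt{r}}{s_\nu} \simeq \frac{|\bar{x}_{.k} - \bar{x}|\sqrt{r}\,c_{k,r}}{\bar{w}_{k,r}} = u_k^{*}.$$

By replacing all means with summations we have

(3) $$g = \frac{\sqrt{r}\,u_k^{*}}{c_{k,r}} = \frac{\left| k\sum_{i=1}^{r} x_{ik} - \sum_{i=1}^{r}\sum_{j=1}^{k} x_{ij} \right|}{W_{k,r}},$$

where $W_{k,r} = k\bar{w}_{k,r} = \sum_{j=1}^{k} w^{(j)}$, and $w^{(j)}$ is the range of the r residuals in the jth column. The g-statistic is now in its simplest form. Obviously the distribution of g is the same for $x_1$ or $x_k$.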

$W_{k,r}$ is the sum of the k column ranges of the row residuals of r observations each. Alternately one may use $W_{r,k}$ by first determining the column residuals and then summing the r row ranges. In order to enjoy the maximum number of degrees of freedom it is advisable to use $W_{k,r}$ or $W_{r,k}$ according as r or k is greater respectively [3]. Letting l = max (r, k) and s = min (r, k), then $W_{s,l}$ is the preferred denominator. Since the range of the residuals is independent of the numerator, there will be no advantage in preferentially selecting $W_{s,l}$ or $W_{l,s}$ on the basis of the numerator. In some cases, however, the arithmetic processes will be simpler when using $W_{l,s}$ and outweigh the advantages of slight increase in the equivalent number of degrees of freedom.

Four alternate forms of equation (3) are possible by interchanging r and k in the numerator and the denominator independently. When k > r we define

$$g_\parallel = \frac{\left| r\sum_{j=1}^{k} x_{rj} - \sum_{i=1}^{r}\sum_{j=1}^{k} x_{ij} \right|}{W_{s,l}}$$

to test whether the rth row mean is a straggler, and

$$g_\perp = \frac{\left| k\sum_{i=1}^{r} x_{ik} - \sum_{i=1}^{r}\sum_{j=1}^{k} x_{ij} \right|}{W_{s,l}}$$

to test whether the kth column mean is a straggler from the grand mean.

The notation $g_\parallel$ (read g-parallel) indicates the means are compared in a manner parallel to the determination of residual ranges and $g_\perp$ (read g-perpendicular) indicates means are compared perpendicular to the array of residuals when their range is determined. Specifically if k > r, then $W_{s,l}$ is determined by subtracting column means and determining ranges horizontally along the rows. Hence to test row means we work parallel to the range determinations and perpendicular for column means.

In Table 1 are tabulated the 5 per cent and 1 per cent critical values of $g_\parallel$ and $g_\perp$ when $W_{s,l}$, the preferred measurement, is used in the denominator. $g_\parallel$ is found above the main diagonal, $g_\perp$ below; the two coincide on the main diagonal. Table 2 is similar to Table 1, but $W_{l,s}$ is the denominator and $g_\parallel$ is now on and below the main diagonal and $g_\perp$ is on and above. In each case s is found along the top stub and l along the side when the statistic is $g_\perp$. Their positions are reversed for $g_\parallel$.


TABLE 1
1 per cent and 5 per cent critical values* of $g_\perp$† and $g_\parallel$‡ using denominator $W_{s,l}$

[Table body not legible in this copy. Top stub: s for $g_\perp$, l for $g_\parallel$; side stub: l for $g_\perp$, s for $g_\parallel$.]

* Upper figure for α = 1 per cent, lower figure for α = 5 per cent.
† $g_\perp$ on and below main diagonal; ‡ $g_\parallel$ on and above main diagonal.

4. Application. As an illustration of the use of the g-statistic, we may consider
an example given by Rider ([7], p. 147), which appears in Table 3.

We wish to test whether hazel does actually flower significantly earlier than
at least some of the other plants. We first form the row residuals by subtracting
each station mean from the plants at that station and compute the column
ranges as in Table 4.


TABLE 2
1 per cent and 5 per cent critical values* of $g_\perp$† and $g_\parallel$‡ using denominator $W_{l,s}$

[Table body not legible in this copy. Top stub: s for $g_\perp$, l for $g_\parallel$; side stub: l for $g_\perp$, s for $g_\parallel$.]

* Upper figure for α = 1 per cent, lower figure for α = 5 per cent.
† $g_\perp$ on and above main diagonal; ‡ $g_\parallel$ on and below main diagonal.


We find that l = 6, s = 5 and $W_{s,l}$ = 150.8. To test whether hazel is a straggler, we run parallel to the range layouts and have

$$g_\parallel = \frac{|5(650) - 7176|}{150.8} = 26.03.$$

Since $W_{s,l}$ was used in the denominator, we refer to Table 1 and above the main diagonal. Now l runs along the top and s along the side stub. The 1 per cent critical value for l = 6 and s = 5 is 3.27 and we conclude that hazel is indeed a straggler with level of significance α << .01.

If we wished to test whether Bratton is a straggler from the other stations, we would be running perpendicular to the range layout and would have

$$g_\perp = \frac{|6(1145) - 7176|}{150.8} = 2.03.$$

Referring to Table 1 below the main diagonal, we see that the 5 per cent critical value is 2.34 and the difference is not significant.
TABLE 3
Day of year of flowering of five plants at six stations

                                    Plant
Station        Hazel  Coltsfoot  Anemone  Blackthorn  Mustard   Totals   Means
Broadchalke     131      205       274       299        337      1246    249.2
Bratton          84      176       276       291        318      1145    229.0
Lenham          131      196       262       299        333      1221    244.2
Dorstone        106      194       239       317        344      1200    240.0
Coaley           77*     190       275       298        332      1172    234.4
Ipswich         121      179       271       293        328      1192    238.4

Totals          650     1140      1597      1797       1992      7176

* This figure was misprinted as 777 in Rider [7].
TABLE 4
Station residuals and variety ranges from Table 3

                                    Plant
Station         Hazel  Coltsfoot  Anemone  Blackthorn  Mustard   Totals
Broadchalke    -118.2    -44.2      24.8      49.8       87.8
Bratton        -145.0    -53.0      47.0      62.0       89.0
Lenham         -113.2    -48.2      17.8      54.8       88.8
Dorstone       -134.0    -46.0      -1.0      77.0      104.0
Coaley         -157.4    -44.4      40.6      63.6       97.6
Ipswich        -117.4    -59.4      32.6      54.6       89.6

Range            44.2     15.2      48.0      27.2       16.2     150.8
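The arithmetic of the illustration can be retraced with a short script. This is a sketch under stated assumptions rather than part of the original paper: the Anemone entries for Dorstone, Coaley and Ipswich (239, 275, 271) are inferred from the station totals, and the helper names `g_column` and `g_row` are hypothetical. It forms the station residuals, sums the plant ranges to obtain W, and evaluates the statistic for hazel (a column) and for Bratton (a row).

```python
# Table 3: rows are stations, columns are plants.
# Assumption: the last three Anemone entries are inferred from station totals.
stations = ["Broadchalke", "Bratton", "Lenham", "Dorstone", "Coaley", "Ipswich"]
plants = ["Hazel", "Coltsfoot", "Anemone", "Blackthorn", "Mustard"]
x = [
    [131, 205, 274, 299, 337],
    [84, 176, 276, 291, 318],
    [131, 196, 262, 299, 333],
    [106, 194, 239, 317, 344],
    [77, 190, 275, 298, 332],
    [121, 179, 271, 293, 328],
]

r, k = len(x), len(x[0])          # r = 6 stations, k = 5 plants
row_totals = [sum(row) for row in x]
col_totals = [sum(x[i][j] for i in range(r)) for j in range(k)]
grand_total = sum(row_totals)

# Row residuals: subtract each station mean from the entries at that station.
resid = [[x[i][j] - row_totals[i] / k for j in range(k)] for i in range(r)]

# Range of the residuals within each plant column, and their sum W.
col_ranges = [
    max(resid[i][j] for i in range(r)) - min(resid[i][j] for i in range(r))
    for j in range(k)
]
W = sum(col_ranges)

def g_column(j):
    """Statistic for the j-th column mean: |k * column total - grand total| / W."""
    return abs(k * col_totals[j] - grand_total) / W

def g_row(i):
    """Statistic for the i-th row mean: |r * row total - grand total| / W."""
    return abs(r * row_totals[i] - grand_total) / W

g_hazel = g_column(plants.index("Hazel"))
g_bratton = g_row(stations.index("Bratton"))
print(round(W, 1), round(g_hazel, 2), round(g_bratton, 2))  # 150.8 26.03 2.03
```

The hazel value lies far above the 1 per cent critical value of 3.27 quoted in the text, while the Bratton value falls below the 5 per cent value of 2.34.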

5. Remarks. In the example, if plant residuals and station ranges had been computed, Table 2 should have been used. $W_{5,6}$ has 18.5 degrees of freedom associated with it; $W_{6,5}$ has 18.2 degrees of freedom, a slight difference. No difference in significance of the test results would have been obtained. As a check on the procedure it was found that $\bar{w}_{5,6}/c_{5,6} = 13.11$ and $\bar{w}_{6,5}/c_{6,5} = 14.93$. The value of $\sigma$ estimated by s in the illustration was 13.99, a reasonably good check.

In the event that r = k, one has the same number of degrees of freedom, but in general $W_{r,k} \ne W_{k,r}$. The difference will be a sampling fluctuation of the $\epsilon_{ij}$ and will ordinarily make little difference except when g lies close to one of the critical values, but in practice, one makes little differentiation between levels of significance of 6 per cent and 4 per cent.

Acknowledgment. The author wishes to express his indebtedness to Professor 
John W. Tukey for indirectly suggesting this problem, and for some direct 
recommendations concerning the notation used in this paper. 
NOTES


NOTE ON WILCOXON’S TWO-SAMPLE TEST WHEN TIES 
ARE PRESENT 


By J. HEMELRIJK 
Mathematical Centre, Amsterdam 


Wilcoxon's parameterfree two-sample test (cf. Wilcoxon [1]; H. B. Mann and D. R. Whitney [2]) depends on a statistic U with the following definition: If $x_1, \ldots, x_n$ and $y_1, \ldots, y_m$ are the two samples, U is the number of pairs (i, j) with $x_i > y_j$. The probability distribution of U, under the hypothesis that the samples have been drawn independently from the same continuous population, has been derived by Mann and Whitney. The influence of ties on this probability distribution has not been investigated as yet.

It is noteworthy that Wilcoxon's U is closely connected with the quantity S, which Kendall (cf. e.g. Kendall [3]) introduced in the theory of rank correlation. When r pairs of numbers $(u_h, v_h)$ are given, S is computed by scoring:

    -1, if $(u_h - u_k)(v_h - v_k) < 0$,
     0, if $(u_h - u_k)(v_h - v_k) = 0$,
    +1, if $(u_h - u_k)(v_h - v_k) > 0$,

and adding the scores for all pairs (h, k) with h < k. If, in this definition, we take r = n + m and substitute the values $x_1, \ldots, x_n, y_1, \ldots, y_m$ in this order for $u_1, \ldots, u_n, u_{n+1}, \ldots, u_r$, and 0 or 1 respectively for $v_h$ if $u_h = x_i$ for some i or $u_h = y_j$ for some j respectively, then the following relation holds:

(1) $2U + S = nm.$

The simplest way to see this is by considering the total score of 2U + S for every pair (h, k). This score is equal to +1 if $v_h = 0$ and $v_k = 1$, and 0 otherwise. The sum of the scores is therefore nm.

Relation (1) holds if no ties are present among the two samples $x_1, \ldots, x_n$ and $y_1, \ldots, y_m$. It is natural to define U in general by extending (1) to the case when there are ties. Since for a pair $(x_i, y_j)$ with $x_i = y_j$ the score of S is equal to zero, the score for U must be taken as ½ for such a pair.
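Relation (1), extended to ties by the half-score convention, can be checked on a small example. The sketch below is illustrative only; the function names and the sample values are hypothetical. It computes U with a score of ½ for each tied pair $(x_i, y_j)$ and S from Kendall's scoring rule with the 0/1 labels described above.

```python
def wilcoxon_u(xs, ys):
    """U = number of pairs with x_i > y_j, a tie x_i == y_j counting 1/2."""
    return sum(1.0 if x > y else 0.5 if x == y else 0.0
               for x in xs for y in ys)

def kendall_s(u, v):
    """Kendall's S: sum over h < k of sign((u_h - u_k) * (v_h - v_k))."""
    def sign(a):
        return (a > 0) - (a < 0)
    r = len(u)
    return sum(sign((u[h] - u[k]) * (v[h] - v[k]))
               for h in range(r) for k in range(h + 1, r))

# Two small samples with ties within x and between x and y (made-up values).
xs = [3, 5, 5, 8]
ys = [1, 5, 9]

u = xs + ys                          # pooled values, x's first
v = [0] * len(xs) + [1] * len(ys)    # label 0 for an x, 1 for a y

U = wilcoxon_u(xs, ys)
S = kendall_s(u, v)
print(U, S, 2 * U + S)  # 6.0 0 12.0
```

Pairs within the same sample contribute nothing to S because their labels coincide, so only the nm cross pairs matter, and each contributes exactly 1 to 2U + S.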

Now Kendall has derived the mean and the standard deviation of S under the hypothesis that for a given order of the quantities $v_1, \ldots, v_r$ all the r! possible permutations of $u_1, \ldots, u_r$ are equally probable. This condition is fulfilled in our case if the samples $x_1, \ldots, x_n$ and $y_1, \ldots, y_m$ have been drawn at random from the same population (which need not be continuous anymore). Therefore, the mean and standard deviation of U under the null hypothesis may be derived from Kendall's formulas.
According to Kendall ([4], pp. 56 and 60), we have

(2) $E(S) = 0$

and

(3) $$\operatorname{var}(S) = \tfrac{1}{18}\Big\{ r(r-1)(2r+5) - \sum_t t(t-1)(2t+5) - \sum_s s(s-1)(2s+5) \Big\} + \frac{1}{9r(r-1)(r-2)}\Big\{\sum_t t(t-1)(t-2)\Big\}\Big\{\sum_s s(s-1)(s-2)\Big\} + \frac{1}{2r(r-1)}\Big\{\sum_t t(t-1)\Big\}\Big\{\sum_s s(s-1)\Big\},$$

where summation $\sum_t$ takes place over the various ties among $u_1, \ldots, u_r$, and $\sum_s$ over the ties among $v_1, \ldots, v_r$; t and s respectively indicating the number of elements in every group of equal numbers among $u_1, \ldots, u_r$ and $v_1, \ldots, v_r$ respectively. From (1) we have

(4) $E(U) = \tfrac{1}{2} nm - \tfrac{1}{2} E(S) = \tfrac{1}{2} nm$

and

(5) $\operatorname{var}(U) = \tfrac{1}{4} \operatorname{var}(S).$

The group $v_1, \ldots, v_r$ consists of n numbers 0 and m numbers 1; thus s in (3) takes the values n and m and we have

$$\sum_s s(s-1)(2s+5) = n(n-1)(2n+5) + m(m-1)(2m+5),$$

$$\sum_s s(s-1)(s-2) = n(n-1)(n-2) + m(m-1)(m-2),$$

$$\sum_s s(s-1) = n(n-1) + m(m-1).$$


Substituting in (3) and (5), we obtain after some reduction

(6) $$\operatorname{var}(U) = \tfrac{1}{12} nm(n+m+1) - \tfrac{1}{72}\sum_t t(t-1)(2t+5) + \frac{n(n-1)(n-2) + m(m-1)(m-2)}{36(n+m)(n+m-1)(n+m-2)}\sum_t t(t-1)(t-2) + \frac{n(n-1) + m(m-1)}{8(m+n)(m+n-1)}\sum_t t(t-1),$$

where $\sum_t$ takes place over the ties among the values $x_1, \ldots, x_n, y_1, \ldots, y_m$, taken together.


When no ties are present this reduces to results of Mann and Whitney [2]:
(7) $E(U) = \tfrac{1}{2} nm$; $\operatorname{var}(U) = \tfrac{1}{12} nm(n + m + 1)$.


From (6) and (7) it is easy to prove (e.g., by induction) that var (U) is decreased
by the presence of ties among the observations. These results constitute a first
step towards the possibility of using Wilcoxon's test for samples from any
population.
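Formula (6) can be compared with a direct enumeration for a small pooled sample. The sketch below is only an illustration: the function names and the pooled values are hypothetical, and (6) is taken in the form $\tfrac{1}{12}nm(n+m+1) - \tfrac{1}{72}\sum t(t-1)(2t+5)$ plus the two correction terms in $\sum t(t-1)(t-2)$ and $\sum t(t-1)$. Every choice of which n of the n + m pooled values form the first sample is treated as equally probable, and exact fractions are used so that the comparison is not blurred by rounding.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

def var_u_formula(n, m, tie_sizes):
    """Variance of U from formula (6); tie_sizes lists the tie-group sizes t."""
    N = n + m   # needs N >= 3 for the middle term
    t1 = sum(t * (t - 1) * (2 * t + 5) for t in tie_sizes)
    t2 = sum(t * (t - 1) * (t - 2) for t in tie_sizes)
    t3 = sum(t * (t - 1) for t in tie_sizes)
    return (Fraction(n * m * (N + 1), 12)
            - Fraction(t1, 72)
            + Fraction(n * (n - 1) * (n - 2) + m * (m - 1) * (m - 2),
                       36 * N * (N - 1) * (N - 2)) * t2
            + Fraction(n * (n - 1) + m * (m - 1), 8 * N * (N - 1)) * t3)

def var_u_exact(pooled, n):
    """Exact null variance of U by enumerating all splits of the pooled values."""
    N = len(pooled)
    half = Fraction(1, 2)
    us = []
    for idx in combinations(range(N), n):
        chosen = set(idx)
        xs = [pooled[i] for i in idx]
        ys = [pooled[i] for i in range(N) if i not in chosen]
        us.append(sum(Fraction(1) if x > y else half if x == y else Fraction(0)
                      for x in xs for y in ys))
    mean = sum(us, Fraction(0)) / len(us)
    return sum((u - mean) ** 2 for u in us) / len(us)

pooled = [1, 2, 2, 3, 3, 3, 4]            # tie groups of sizes 2 and 3
ties = list(Counter(pooled).values())     # sizes of 1 contribute nothing
print(var_u_formula(3, 4, ties), var_u_exact(pooled, 3))
print(var_u_formula(3, 4, []))            # no ties: nm(n+m+1)/12
```

In this example the variance with ties is smaller than the tie-free value 8, in line with the remark that ties decrease var (U).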
CORRECTION TO "ON CERTAIN METHODS OF ESTIMATING
THE LINEAR STRUCTURAL RELATION"


By J. Neyman and Elizabeth L. Scott
University of California, Berkeley


We are indebted to Professor J. Wolfowitz for calling our attention to a blunder in our paper under the above title (Annals of Math. Stat., Vol. 22 (1951), pp. 352-361). In the statement of Theorem 3 on page 358 the symbols $\xi_{p_1}$ and $\xi_{1-p_2}$ should be replaced by $X_{p_1}$ and $X_{1-p_2}$, respectively. It will be noticed that this change does not affect the proof nor the implications of the theorem.
ABSTRACTS OF PAPERS


(Abstracts of papers presented at the Washington meeting of the Institute, 
October 26-27, 1951) 


1. On the Law of Propagation of Error. (Preliminary Report.) Churchill
Eisenhart and I. Richard Savage, National Bureau of Standards.


In the main the results presented in this paper are not new, being at most minor extensions of known results. The aim is a unified treatment of the "law of propagation of error," with emphasis on the practical meaning of the formulas, and attention to the details of their rigorous derivation.


2. Multivariate Orthogonal Polynomials. (Preliminary Report.) L. W. Cooper 
AND D. B. Duncan, Virginia Polytechnic Institute. 


It is well known that the work of fitting a regression function, which is a polynomial in 
one variate, viz., (1) y = L{_ bir‘ can be greatly simplified by the use of orthogonal poly- 
nomials of the form (2) «, = Liaok zi. It is sometimes required to fit a regression function 
of the more complex multivariate polynomial form 


(3) 


‘i, ook Tite + o8k, 
Following suggestions of Tyler (Tyler, G. W., "The experimental evaluation of definite
integrals," unpublished thesis, Virginia Polytechnic Institute, Blacksburg, Va., 1949) and
DeLury (DeLury, D. B., Values and Integrals of the Orthogonal Polynomials up to n = 26,
University of Toronto Press, 1950), polynomials can be defined as ¢;.j,....6 = €:€) °° iso 
which effects the same simplicity for fitting the functions (3). These are termed multivari- 
ate orthogonal polynomials. Their properties are investigated and short methods for using 
them are developed. 




3. An Analysis of Variance for Paired Comparisons. Henry Scheffé, Columbia
University.


In a paired comparison test of m brands of a product each of the ½m(m - 1) pairs is presented to 2r judges: to r in one order, and to r in the other. An analysis of variance is developed for the case in which the judges' preferences are expressed on a 7- or 9-point scale. Account is taken of the effects of order of presentation. Main effects are defined for the brands. The hypothesis of subtractivity, analogous to the hypothesis of additivity in a two-way layout, states roughly that the results for any pair, after order effects are eliminated, can be attributed entirely to the difference of the main effects of the two brands in the pair. F-tests for the hypothesis of subtractivity and for the significance of the main effects are given, as well as estimates of various parameters and their standard errors. A numerical example illustrates the method. (Work sponsored by the Office of Naval Research.)


4. Statistical Theory of Fatigue Failures. E. J. Gumbel, Consultant, Stanford
University, and A. M. Freudenthal, Columbia University.


The interpretation of the results of fatigue tests is made difficult by the fact that pro- 
gressive damage, which finally leads to fatigue failure, is a highly structure-sensitive proc- 
ess. It produces, therefore, a very wide scatter of the test results for the rational analysis of 
which the character of the statistical distribution must be known. If n specimens are sub- 
mitted to repeated stress cycles of amplitudes S they break at varying numbers N of cycles. 
The interpretation of the relative ranks as cumulative frequencies of survival for N cycles 
at different stress amplitudes leads to criteria for the consistency of the observed series. 
The frequencies of survival are reproduced by the third asymptotic probability of smallest 
values, and it is assumed in first approximation that the probability of survival reaches 
unity only for N = 0. If the decimal logarithms of N are traced on the extremal probability 
paper in descending scale at the plotting positions m/(n + 1), the two parameters in the 
survivorship function which depend upon S may be estimated in the same way as for the 
first asymptotic probability function of extreme values. The fit of the theoretical straight 
lines obtained for copper, steel, and aluminum specimens is very good. Extrapolations give 
the number of cycles for which the probability of survival for a given stress level differs 
from unity by any desired small value. (Work done in part under the sponsorship of the 
Office of Naval Research.) 


5. Analysis of Chain Block Designs. (Preliminary Report.) W. S. Connor,
Jr., and W. J. Youden, National Bureau of Standards.


One and a fraction replications can be arranged in incomplete blocks so that there is a 
carry over of two (or more) treatments from one block to the next block. Estimates of the 
treatment and block effects and the analysis of variance have been obtained for these chain- 
block designs. 
6. An Approximation Theorem. (Preliminary Report.) I. Richard Savage,
National Bureau of Standards.


The distribution of a function of a sample mean is studied and the rapidity at which this function's distribution approaches normality is investigated. Berry's theorem about the distributions of sample means is used. The theorem proved gives a remainder term in the case where the function is increasing and has a continuous second derivative. The remainder is the same as Berry's, plus another term which depends on the second derivative of the function and is of size $1/\sqrt{n}$.


7. Testing Multiparameter Hypotheses. E. L. Lehmann, Stanford University
and University of California, Berkeley.


Let the distributions of some random variables depend on real parameters $\theta_1, \ldots, \theta_s$ and consider the hypothesis H: $\theta_i \le c_i$ for i = 1, ..., s. It is shown under certain regularity assumptions that unbiased tests of H do not exist. Tests of minimum bias and other types of minimax tests are derived under suitable monotonicity conditions. The two-sided hypotheses H': $a_i \le \theta_i \le b_i$, i = 1, ..., s are discussed as well as certain related multidecision problems.


8. Analysis of a Certain Random Walk by the Monte Carlo Method. Robert
Mirsky, Cornell Aeronautical Laboratory.


A point object B(x, y) starting from (d, 0) moves toward its ultimate destination T(0, 0) in the following manner: every τ seconds B takes a sight reading on T and attempts to follow the rectilinear course until the next reading is taken; the speed of B is a constant v. Because of imperfections of the sight reading instruments, the actual course followed at each time deviates from the true line of sight BT so that in general a zigzag path is determined. The angular errors $\alpha_1, \alpha_2, \ldots$ are assumed to be symmetric about BT, independent, and to have the same distribution at each time τ. This random walk was studied by M. Kac who obtained several results which are compared in this paper with results obtained by Monte Carlo sampling. It is shown that while statistical analysis can be used to check the accuracy of the Monte Carlo process, Monte Carlo results can at the same time be used to determine the validity of analytic formulas whose derivation involved simplifying assumptions or approximations. (This work was carried out under the sponsorship of the Office of Naval Research.)


9. On Certain Estimators Based on Large Samples of Extremes. (Preliminary
Report.) Julius Lieblein, National Bureau of Standards.


E. J. Gumbel and B. F. Kimball have given estimators for the parameters of the asymptotic distribution of largest values, Prob $\{X \le x\} = \exp[-e^{-\alpha(x-u)}]$, which have been applied in the analysis of data in large samples. The present paper applies the theory of order statistics to the problem of seeking more efficient estimators which are at the same time simpler to compute. Several large-sample estimators are found which, with one exception, appear to have greater efficiency than those derived from the methods of Gumbel and Kimball, yet require much less effort in computation. If punch cards are used the work can be handled by a mechanical sorter which ranks the observations in order of size and then selects a small number of them with predetermined ranks. (This work was sponsored by the National Advisory Committee for Aeronautics.)
10. The Use of Previous Experience in Reaching Statistical Decisions. J. L.
Hodges, Jr., University of California, Berkeley, and E. L. Lehmann,
Stanford University and University of California, Berkeley.

Instead of minimizing the maximum risk it is proposed to restrict attention to decision 
procedures whose maximum risk does not exceed the minimax risk by more than a given 
amount. Subject to this restriction one may wish to minimize the average risk with respect 
to some guessed a priori distribution suggested by previous experience. It is shown how 
Wald’s minimax theory can be modified to yield analogous results concerning such restricted 
Bayes solutions. A number of examples are discussed. 


11. Results of Some Tests of Randomness on Pseudo-random Numbers. (Preliminary Report.) J. M. Cameron, National Bureau of Standards.


A method for generating random numbers on automatic computing machines (such as the SEAC) has been developed by Dr. Olga Taussky-Todd. Using this method about $2^{40}$ pseudo-random numbers can be generated by taking residues (mod $2^{42}$) of successive powers of $5^{17}$. The results of some tests of the randomness applied to these numbers are presented. The evidence from these tests is in agreement with the hypothesis of randomness.
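The generating rule described in this abstract can be sketched as follows. The exponents are hard to read in the source, so the sketch assumes a modulus of $2^{42}$ and a multiplier of $5^{17}$; the function names are hypothetical and nothing here reproduces the actual SEAC routine. Because the full cycle is far too long to walk through, the period is checked on a reduced modulus $2^{12}$, where a multiplier congruent to 5 (mod 8) has period $2^{10}$.

```python
def power_residues(multiplier, modulus, count, seed=1):
    """Return `count` successive residues of seed * multiplier**n (mod modulus)."""
    out = []
    x = seed
    for _ in range(count):
        x = (x * multiplier) % modulus
        out.append(x)
    return out

def period(multiplier, modulus, seed=1):
    """Number of steps until the sequence returns to the seed (small moduli only)."""
    x = (seed * multiplier) % modulus
    steps = 1
    while x != seed:
        x = (x * multiplier) % modulus
        steps += 1
    return steps

# Assumed parameters: multiplier 5**17, modulus 2**42.
first_three = power_residues(5 ** 17, 2 ** 42, 3)
print(first_three[0])                      # 762939453125, since 5**17 < 2**42

# Reduced-size check of the cycle length: modulus 2**12 gives period 2**10.
print(period(5 ** 17 % 2 ** 12, 2 ** 12))  # 1024
```

The same argument applied to the modulus $2^{42}$ gives a cycle of $2^{40}$ residues, which agrees with the count of about $2^{40}$ numbers under the assumed reading of the exponents.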




(Abstracts of papers presented at the Boston meeting of the Institute, December 26-29, 1951) 


12. Two Rank Order Tests Which Are Most Powerful against Specific Parametric Alternatives. Milton E. Terry, Jr., Virginia Polytechnic Institute.


The most powerful rank order tests of the hypotheses that two samples come from the same population and that in each of k groups of two samples the two samples came from a common population are considered, and most powerful rank order tests against certain normal alternatives are derived. For the two tests of immediate practical importance, asymptotic, approximate, and (for certain small sample sizes) exact distributions are given. The relationship of these tests with others is investigated.


13. Partially Balanced Designs with k > r = 3, $\lambda_1 = 1$, $\lambda_2 = 0$. R. C. Bose
and W. H. Clatworthy, University of North Carolina.


Incomplete block designs with a few replications are of practical importance to experimenters. Partially balanced designs with k > r = 2 have been studied by one of the authors (R. C. Bose, "Partially balanced incomplete block designs with two associate classes involving only two replications," Calcutta Stat. Assoc. Bull., Vol. 3 (1951), pp. 120-125). The present paper extends this investigation to the case k > r = 3, $\lambda_1 = 1$, $\lambda_2 = 0$. It has been shown that only three types of partially balanced designs belong to this class, viz. (a) designs obtained by dualizing balanced incomplete block designs with k = 3, $\lambda = 1$; (b) lattice designs with r = 3; (c) partially balanced designs belonging to the series $v = (t + 2)(2t + 3)$, $b = 3(2t + 3)$, $r = 3$, $k = t + 2$, $n_1 = 3(t + 1)$, $n_2 = 2(t + 1)^2$, $p_{11}^1 = t$. It has been shown that for the series (c) the only combinatorially possible design for which k > 3 is the design with t = 3. The cases t = 0 and t = 1, though not belonging to the class k > r, are combinatorially possible cases. Designs corresponding to all other values of t have been shown to be impossible.


14. Some Observations on the F-Test in Analysis of Variance. S. N. Roy, University of North Carolina.


It is well known that analysis of variance deals with a class of problems which can be reduced to a problem of testing of a (composite) linear hypothesis (within a certain model) and for which the test in common use is the F-test; and within the last few years several optimum properties of this test have been brought out by different workers. It was also noted in 1948 by Lehmann and Stein (E. Lehmann and C. Stein, "Most powerful tests of composite hypotheses I. Normal distributions," Annals of Math. Stat., Vol. 19 (1948), pp. 495-516) and the author (S. N. Roy, "Notes on testing of composite hypotheses, II," Sankhyā, Vol. 9 (1948), pp. 19-38) that for this problem of testing of a (composite) linear hypothesis it is possible to construct an infinite class of similar region tests among which there is a most powerful test against a specific (composite) alternative (which differs from alternative to alternative). The corresponding statistic (whose structure depends upon the alternative) has the t distribution, when the hypothesis to be tested is true; and the test itself could be properly called a generalized t-test, since the ordinary t-test could be shown to be a special case of it. This paper works out a connection between the ordinary F-test and this generalized t-test, showing that the F-test at a level of significance, say α, could be derived in a certain manner from the infinite class of (most powerful) t-tests of the hypothesis at a level (< α) against the infinite class of possible specific (composite) alternatives.


15. The Neyman-Pearson Lemma Factor Functions. L. M. Court, American
Power Jet Company, New York.


According to the Neyman-Pearson lemma, recently proved necessary as well as sufficient by Dantzig and Wald ("On the fundamental lemma of Neyman and Pearson," Annals of Math. Stat., Vol. 22 (1951), pp. 87-93), the optimum critical region w of size α for testing $p_0(x) = p_0(x_1, \ldots, x_n)$ against $p_1(x)$ is given by w: $p_1(x) \ge k p_0(x)$, where k is a suitable constant (factor). There obviously is a relationship between the size α and the factor k, i.e., $\alpha = r(k)$ or $k = r^{-1}(\alpha)$, where the functions $r$ and $r^{-1}$ need not be as naive as the ones encountered in the elementary calculus. The writer establishes four properties of $r(k)$ and $r^{-1}(\alpha)$: (1) to any value of k there corresponds in general a closed interval of α-values (which may reduce to a point); (2) to any value of α, there corresponds in general an interval of k-values, open at the bottom and closed at the top; (3) $r(k)$ is a nonincreasing function of k in a generalized sense; and (4) $r^{-1}(\alpha)$ is a nonincreasing function of α in a generalized sense. These properties are extended, with some restrictions, to the functions $\alpha_i = r_i(k_1, \ldots, k_m)$ (i = 1, ..., m) and their inverses, where $k_1, \ldots, k_m$ are the factors resulting when a region w is sought maximizing $\int_w f_{m+1}(x)\,dx$ subject to $\int_w f_i(x)\,dx = \alpha_i$ (i = 1, ..., m), this region being characterised by $f_{m+1}(x) \ge k_1 f_1(x) + \cdots + k_m f_m(x)$, the general Neyman-Pearson lemma. (Every $f_i(x)$ is assumed nonnegative throughout $R_n$.)


16. The Probability of a Correct Ranking. (Preliminary Report.) Robert E.
Bechhofer, Columbia University.


Let $X_{ij}$ be normally and independently distributed with mean $\mu_i$ and unit variance (i = 1, 2, ..., k; j = 1, 2, ..., N). Let $\mu[1], \mu[2], \ldots, \mu[k]$ be the ordered $\mu_i$; let $\bar{X}(i)$ be the sample arithmetic mean associated with the population having mean $\mu[i]$ (i = 1, 2, ..., k). The $\mu_i$ are unknown; it is not known which population is associated with $\mu[i]$. Specify positive integers $r_1, r_2, \ldots, r_s$ ($s \le k$) such that $\sum_{i=1}^{s} r_i = k$. Define "indifference regions" $\delta_i^*$ as the smallest differences, $\mu[i + 1] - \mu[i]$, (i = 1, 2, ..., k - 1), which it is desired to detect. Define the symbol $S_j = \sum_{i=0}^{j} r_i$. We wish to determine $P_N(\delta_1^*, \delta_2^*, \ldots, \delta_{k-1}^*)$ = Inf for $\delta_i \ge \delta_i^*$ (i = 1, 2, ..., k - 1) of Prob [Max$\{\bar{X}(S_j + 1), \ldots, \bar{X}(S_{j+1})\}$ < Min$\{\bar{X}(S_{j+1} + 1), \ldots, \bar{X}(S_{j+2})\}$ simultaneously for j = 0, 1, 2, ..., s - 2 with $r_0 = 0$]. The required probability is expressed as a volume under a (k - 1)-variate normal surface. For given β (0 < β < 1), the desired N is the smallest one for which $P_N(\delta_1^*, \delta_2^*, \ldots, \delta_{k-1}^*) \ge 1 - \beta$. An analogue of power is defined. A method of making confidence statements is described. It is shown that many analysis of variance (Model 1) problems can be more meaningfully formulated as problems involving multiple ranking of means, and how experimental designs (randomized blocks, Latin squares, etc.) can be used to increase the probability of a correct ranking. A procedure similar to the one described above is applied to the ranking of variances when the population means are known or unknown, the required probabilities being expressed as volumes under (k - 1)-variate generalized F-distribution surfaces. Other directions of generalization are indicated. (Part of this work was done under the sponsorship of the ONR.)


17. A Nonparametric Analogue Based upon Ranks of One-way Analysis of 
Variance. William Kruskal, University of Chicago.


Given independent random variables $\xi_j^{(i)}$ ($i = 1, \cdots, C$; $j = 1, \cdots, n_i$; $\sum n_i = N$) with
continuous distribution functions $\Pr\{\xi_j^{(i)} \le x\} = F_i(x)$, one may wish to test the null
hypothesis $F_1 = F_2 = \cdots = F_C$ against alternatives of form $F_i(x) = F(x - \theta_i)$
($i = 1, \cdots, C$) with not all the $\theta$'s equal. The following test (essentially proposed by Wallis) is
discussed: replace the $\xi$'s by their ranks in the N-fold sample, $X_j^{(i)}$, and compute
$H = [12/(N(N + 1))] \sum_{i=1}^{C} (1/n_i)(R_i - \frac{1}{2} n_i(N + 1))^2$, where $R_i = \sum_j X_j^{(i)}$; reject if H is too
large. This test is a generalization of the symmetrical two-tail version of the Wilcoxon-
Mann-Whitney test, and is also equivalent to the use of the standard F-test for one-way
analysis of variance after replacement of observations by ranks. H is shown to be asymptotically
chi-square with C - 1 degrees of freedom. A condition for consistency is stated and
given an intuitive interpretation; the translation alternatives mentioned above satisfy this
condition, but so do many others. The variance of H under the null hypothesis and the maximum
value of H are obtained explicitly and their use in approximating the distribution of H
is suggested. The possibly discontinuous case is considered, and a method for handling ties
proposed by Wallis is discussed.
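
The statistic H defined above can be computed directly from the rank sums. The following is a minimal Python sketch, assuming no tied observations (the continuous case of the abstract); the function name `kruskal_wallis_h` is illustrative and does not come from the paper.

```python
def kruskal_wallis_h(samples):
    """Return H for a list of samples, assuming no tied observations."""
    pooled = sorted(value for sample in samples for value in sample)
    if len(set(pooled)) != len(pooled):
        raise ValueError("ties present; this sketch assumes continuous data")
    # Rank of each observation in the pooled N-fold sample (1 = smallest).
    rank = {value: position + 1 for position, value in enumerate(pooled)}
    total = len(pooled)  # N
    weighted_sum = 0.0
    for sample in samples:
        size = len(sample)                          # n_i
        rank_sum = sum(rank[v] for v in sample)     # R_i
        weighted_sum += (rank_sum - 0.5 * size * (total + 1)) ** 2 / size
    return 12.0 * weighted_sum / (total * (total + 1))


print(kruskal_wallis_h([[1.1, 2.3, 3.0], [4.2, 5.5, 6.1], [7.4, 8.8, 9.9]]))
```

Under the null hypothesis the value would be referred to a chi-square distribution with C - 1 degrees of freedom, as stated in the abstract; the handling of ties is deliberately left out of this sketch.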


18. A Series of Group Divisible Designs for Two-way Elimination of
Heterogeneity. S. S. Shrikhande, University of Kansas.


From the affine resolvable design $v = s^2$, $b = s^2 + s$, $r = s + 1$, $k = s$, $\lambda = 1$ omit a
complete replication, and from the remaining blocks omit the treatments lying in any
$n$ ($< s - 1$) blocks of the omitted replication. We get a group divisible design with
$v = s(s - n)$, $b = s^2$, $r = s$, $k = s - n$, where the $v$ treatments can be divided into $s - n$
groups of $s$ each where any two treatments of the same group do not occur together in any
block, whereas any two treatments coming from different groups occur together in just one
block. This design can be used for two-way elimination of heterogeneity by suitably
interchanging the positions of treatments in the various blocks if necessary. All treatment
comparisons are made with at most two accuracies.
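
The construction can be checked on a small case. The sketch below is an illustration under a stated assumption: it takes s prime and realizes the affine resolvable design as the affine plane over the integers mod s, with the vertical lines x = c playing the role of the omitted replication. The function names are hypothetical.

```python
from itertools import combinations


def build_design(s, n):
    """Sketch for prime s only: treatments are points (x, y) mod s."""
    kept_x = range(n, s)  # treatments on the lines x = 0, ..., n-1 are omitted
    # Groups: the s - n surviving blocks of the omitted replication (lines x = c).
    groups = [[(x, y) for y in range(s)] for x in kept_x]
    # Blocks: the lines y = slope*x + intercept of the s retained replications.
    blocks = [
        [(x, (slope * x + intercept) % s) for x in kept_x]
        for slope in range(s)
        for intercept in range(s)
    ]
    return groups, blocks


def concurrences(groups, blocks):
    """Return the sets of pair counts within groups and between groups."""
    group_of = {t: g for g, members in enumerate(groups) for t in members}
    counts = {}
    for block in blocks:
        for pair in combinations(sorted(block), 2):
            counts[pair] = counts.get(pair, 0) + 1
    within, between = set(), set()
    for pair in combinations(sorted(group_of), 2):
        same = group_of[pair[0]] == group_of[pair[1]]
        (within if same else between).add(counts.get(pair, 0))
    return within, between


groups, blocks = build_design(s=5, n=2)
within, between = concurrences(groups, blocks)
print(len(groups) * 5, len(blocks), len(blocks[0]), within, between)
```

For s = 5 and n = 2 the parameters should come out as v = 15, b = 25, k = 3, with treatments of the same group never meeting and treatments of different groups meeting exactly once.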


19. A Test of the Uniformity of a Circular Distribution. (Preliminary Report.) 
J. Arthur Greenwood, Manhattan Life Insurance Co., and David
Durand, National Bureau of Economic Research.


Let $X_1, \cdots, X_n$ be a sample from an unknown distribution on the circumference of a
circle. To test the hypothesis that the distribution is uniform, the use of the statistic $A$ is
proposed, where $n^2 A^2 = (\sum_{i=1}^{n} \cos X_i)^2 + (\sum_{i=1}^{n} \sin X_i)^2$. The distribution of $A$ is
essentially given by the solution of Pearson's random walk problem (K. Pearson, “Mathematical
contributions to the theory of evolution. XV. A mathematical theory of random migration,”
Drapers' Co. Res. Mem. Biometric Series 3, Cambridge University Press, 1906) and
is $P\{A < a\} = \int_0^{\infty} [J_0(t)]^n \, na \, J_1(nat) \, dt$; this distribution is also obtained readily by the
use of characteristic functions. For the alternatives given by the cyclical distribution of
von Mises (“Über die ‘Ganzzahligkeit’ der Atomgewichte und verwandte Fragen,”
Physik. Zeitschr., Vol. 19 (1918), pp. 490-500) the method of characteristic functions gives
the power function in terms of similar but less tractable integrals.
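
The statistic A is the length of the mean resultant vector of the observed directions. A minimal Python sketch follows, assuming the angles are given in radians; the function name is illustrative only.

```python
import math


def uniformity_statistic(angles):
    """Return A, where n^2 A^2 = (sum cos X_i)^2 + (sum sin X_i)^2."""
    n = len(angles)
    cos_sum = sum(math.cos(x) for x in angles)
    sin_sum = sum(math.sin(x) for x in angles)
    return math.hypot(cos_sum, sin_sum) / n


# Four directions spread evenly round the circle give A near 0;
# identical directions give A = 1.
print(uniformity_statistic([0.0, math.pi / 2, math.pi, 3 * math.pi / 2]))
print(uniformity_statistic([0.3, 0.3, 0.3]))
```

Values of A near zero are compatible with uniformity, and large values speak against it; the critical value would come from the distribution given in the abstract, which this sketch does not evaluate.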


20. A Method for Limit Theorems in Markov Chains. T. E. Harris, The Rand 
Corporation. 


The following Markov process has been used by Mosteller in a psychological application.
Let $x_1, x_2, \cdots$ be the random variables of the process, $0 < x_i < 1$. Let $S(x) = \sigma x$ and
$T(x) = \alpha + (1 - \alpha)x$, $0 < \alpha, \sigma < 1$, be two linear functions of $x$, and let $f(x) = a + bx$
be a third linear function such that $0 < f(x) < 1$ when $0 < x < 1$. The transition law is
as follows. Suppose $x_n$ is given. Then $x_{n+1} = S(x_n)$ with probability $f(x_n)$ and $x_{n+1} = T(x_n)$
with probability $1 - f(x_n)$. It is shown that the distribution of $x_n$ approaches a limiting
distribution independent of the initial value as $n \to \infty$. The method can be modified to give a proof of
the “ergodic” theorem for Markov processes with discrete states and also, in the generalization
of renewal theory by Chung and Pollard, to random variables with positive and negative
values, to remove the restriction to distributions with an absolutely continuous component.
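
The transition law is easy to simulate, which gives an informal check that the starting value is forgotten. The Python sketch below is only an illustration: the parameter values (sigma = 0.5, alpha = 0.3, a = 0.2, b = 0.6) and the function names are assumptions chosen so that 0 < f(x) < 1 on the unit interval, and they do not come from the paper.

```python
import random


def step(x, rng, sigma, alpha, a, b):
    """One transition: S(x) with probability f(x) = a + b*x, otherwise T(x)."""
    if rng.random() < a + b * x:
        return sigma * x
    return alpha + (1.0 - alpha) * x


def sample_after(x0, n_steps, n_paths, seed, sigma=0.5, alpha=0.3, a=0.2, b=0.6):
    """Simulate n_paths independent chains from x0 and return the final values."""
    rng = random.Random(seed)
    values = []
    for _ in range(n_paths):
        x = x0
        for _ in range(n_steps):
            x = step(x, rng, sigma, alpha, a, b)
        values.append(x)
    return values


low_start = sample_after(0.05, 200, 2000, seed=1)
high_start = sample_after(0.95, 200, 2000, seed=2)
print(sum(low_start) / len(low_start), sum(high_start) / len(high_start))
```

The two sample means should be close, in line with the limiting distribution being independent of the initial value; a simulation of this kind illustrates the theorem and does not prove it.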


21. On Tests of Certain Hypotheses about Multivariate Normal Populations. 
S. N. Roy, University of North Carolina. 


Large classes of problems in multivariate analysis can be brought under one or other of 
the three problems of testing of the composite hypotheses of (i) equality of the dispersion 
matrices for two p-variate normal populations; (ii) equality of the means (of each separate 
variate) for K(>2) p-variate normal populations; (iii) independence between two sets of 
variates $p_1$ and $p_2$ under a $(p_1 + p_2)$-variate normal distribution. It is partly shown in
previous papers by different workers and more fully shown here that, if we take over and 
carry through the idea behind discriminant analysis, we would get in each case a test which, 
for case (i), would be based on the largest and smallest roots, and, for cases (ii) and (iii), 
on the largest root of certain determinantal equations (different for the three cases). Such 
tests would have all the known desirable properties of other possible similar region tests 
(including the likelihood ratio tests) of the composite hypotheses concerned. The main 
purpose of the present paper, however, is to show that the three tests given here (for the 
three cases), at any level, say $\alpha$, could each be derived in a certain manner from an infinite
class of most powerful tests at another level $\beta$ ($< \alpha$) against different possible (composite)
alternatives. 


22. An Inequality for Orthogonal Arrays of Strength 2. S. S. Shrikhande,
University of Kansas.


A matrix $A = (a_{ij})$ with $l$ rows and $N$ columns where each element $a_{ij}$ is one of the $n$ integers
$1, 2, \cdots, n$ is called an orthogonal array of strength 2 if for every pair $(i_1, i_2)$ of two rows,
the pairs $(a_{i_1 j}, a_{i_2 j})$, $j = 1, 2, \cdots, N$ contain each of the $n^2$ possible pairs exactly $\mu$ times
($N = \mu n^2$). The array is said to be of size $N$, $l$ constraints and $n$ levels. It is known that
$l_{\max} \le I((\mu n^2 - 1)/(n - 1))$, where $I(x)$ is the largest integer contained in $x$. If $n$ and $t$
are integers ($n > 2$, $t > 0$), then with $N = n^2((n - 1)t + 1)$, where $n$ is the number of
levels, the above inequality gives $l_{\max} \le n^2 t + n + 1$. Using a result due to Plackett and
Burman (“The design of optimum multifactorial experiments,” Biometrika, Vol. 33 (1946),
pp. 305-325) it can be shown that $l_{\max} \le n^2 t + n$ in the three cases of impossibility of
affine resolvable balanced incomplete block designs announced at the thirteenth summer
meeting of the Institute of Mathematical Statistics in September, 1951.
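
The arithmetic behind the bound can be verified mechanically. The Python sketch below evaluates the known bound $I((\mu n^2 - 1)/(n - 1))$, confirms that it equals $n^2 t + n + 1$ when $\mu = (n - 1)t + 1$, and checks the defining property of strength 2 on a small array. The function names and the example array are illustrative assumptions, not taken from the paper.

```python
from itertools import combinations, product


def constraint_bound(n, mu):
    """Largest integer contained in (mu*n^2 - 1)/(n - 1)."""
    return (mu * n * n - 1) // (n - 1)


def is_strength_two(rows, n):
    """Check that every pair of rows shows each ordered pair equally often."""
    size = len(rows[0])
    if size % (n * n):
        return False
    mu = size // (n * n)
    for first, second in combinations(rows, 2):
        seen = list(zip(first, second))
        if any(seen.count(pair) != mu
               for pair in product(range(1, n + 1), repeat=2)):
            return False
    return True


# With mu = (n - 1)t + 1 the general bound reduces to n^2 t + n + 1.
for n, t in product(range(3, 7), range(1, 5)):
    assert constraint_bound(n, (n - 1) * t + 1) == n * n * t + n + 1

# A small array with n = 2 levels, N = 4, attaining the bound of 3 constraints.
example = [[1, 1, 2, 2], [1, 2, 1, 2], [1, 2, 2, 1]]
print(is_strength_two(example, 2), constraint_bound(2, 1))
```

The identity holds because $\mu n^2 - 1 = (n - 1)(n^2 t + n + 1)$ when $\mu = (n - 1)t + 1$, so the division is exact. The improvement to $n^2 t + n$ claimed in the abstract depends on the Plackett and Burman result and is not reproduced here.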


23. The Distribution of the Range in Samples from a Compound Normal
Population. Kenneth H. Kramer, Youngstown Sheet and Tube Company.


The distribution of the range, $R$, in samples of $n$ observations from a population having
the distribution function $F(x) = p\Phi(x; m, \sigma) + q\Phi(x; 0, 1)$, where
$\Phi(x; m, \sigma) = (1/(\sqrt{2\pi}\,\sigma)) \int_{-\infty}^{x} \exp\{-(t - m)^2/(2\sigma^2)\} \, dt$, $0 \le p \le 1$, is considered.
This type of distribution provides a good model for many industrial processes, and the range
is used extensively in industrial statistics. For $n = 2, 3$, the range distribution function
$G_n(R) = n \int_{-\infty}^{\infty} [F(x + R) - F(x)]^{n-1} \, dF(x)$ is integrated, giving $G_2(R)$ in terms of integrals
of the normal frequency function, and $G_3(R)$ in terms of integrals of the bivariate normal
frequency function over rectangular regions. These expressions for $G_2(R)$ and $G_3(R)$ are then
used to derive similar expressions for the mean range, $\bar{R}$, and the standard deviation of the
range, $\sigma_R$. To compute $G_n(R)$ for $n > 3$, an expression by Hartley (E. S. Pearson, “The
probability integral of the range in samples of n observations from a normal population.
I. Foreword and tables,” Biometrika, Vol. 32 (1942), pp. 301-308. H. O. Hartley, “The
probability integral of the range in samples of n observations from a normal population.
II. Numerical evaluation of the probability integral,” Biometrika, Vol. 32 (1942),
pp. 309-310. H. O. Hartley, “The range in random samples,” Biometrika, Vol. 32 (1942),
pp. 334-348, especially pp. 341, 342)
is generalized. Tables given by Hastings, Mosteller, Tukey, and Winsor (“Low moments
for small samples: a comparative study of order statistics,” Annals of Math. Stat., Vol.
18 (1947), pp. 413-426) provide a basis for approximate formulas for $\bar{R}$ and $\sigma_R$ (all $n$);
these formulas are asymptotic in $m$, and in the case of $\bar{R}$ give a lower bound on the exact
value. Tables and graphs of $G_n(R/\sigma_c)$ and $\bar{R}/\sigma_c$, where $\sigma_c^2 = p\sigma^2 + q + pqm^2$ is the
variance of the compound normal population, are given for p = 0, },4,m = 0,1,2,2,¢0 =
1,2, AR = 1,n = 2,3, --- , 20. The tables are then used to construct power curves for the 
Shewhart control chart for ranges under various types of alternatives. 
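
The range distribution function $G_n(R) = n \int [F(x + R) - F(x)]^{n-1} \, dF(x)$ can also be evaluated numerically for any $n$. The following Python sketch uses a plain trapezoidal rule; it assumes $q = 1 - p$ (not stated explicitly in the abstract), and the function names, integration limits and step count are illustrative choices rather than the author's method.

```python
import math


def normal_cdf(x, mean, sd):
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def normal_pdf(x, mean, sd):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def range_cdf(r, n, p, m, sigma, steps=4000):
    """Trapezoidal estimate of G_n(r) for the compound normal population."""
    q = 1.0 - p  # assumption: the abstract does not state q = 1 - p explicitly

    def cdf(x):
        return p * normal_cdf(x, m, sigma) + q * normal_cdf(x, 0.0, 1.0)

    def pdf(x):
        return p * normal_pdf(x, m, sigma) + q * normal_pdf(x, 0.0, 1.0)

    spread = 8.0 * max(1.0, sigma)          # generous truncation of the tails
    lower = min(0.0, m) - spread - r
    upper = max(0.0, m) + spread
    width = (upper - lower) / steps
    total = 0.0
    for i in range(steps + 1):
        x = lower + i * width
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * (cdf(x + r) - cdf(x)) ** (n - 1) * pdf(x)
    return n * total * width


# Check against the single normal population (p = 0), where G_2(R) = erf(R/2).
print(range_cdf(1.0, 2, 0.0, 0.0, 1.0), math.erf(0.5))
```

For $p = 0$ and $n = 2$ the range is the absolute difference of two standard normal variates, so $G_2(R) = \mathrm{erf}(R/2)$, which gives a convenient check on the quadrature.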


24. On the Operating Characteristics of Certain Quality Control Tests. John
E. Walsh, U. S. Naval Ordnance Test Station, Pasadena.


This paper presents values of the operating characteristic (OC) function for a common 
type of quality control test and for two possible substitutes for this test. The situation 
considered is that of small samples from a normal population. The common type of quality 
control test investigated is based on the sample mean and the sample standard deviation 
(using n — 1). One of the substitute tests is based on the t-statistic and the sample standard 
deviation. The other substitute test is based on the sample mean and the estimate of the 
population standard deviation obtained by using the mean of the population which deter- 
mined the control limits. Each of the three types of tests is found to have regions where its 
operating characteristics are poor. No one type of test has uniformly better operating char- 
acteristics than the others. For each type of test there exist regions where its operating 
characteristics are superior to those of the other two. On the whole, however, the tests 
based on the t-statistic and the sample standard deviation appear to be inferior to the other 
two types, which are roughly equivalent. An extensive OC function analysis is presented 
for the common type of quality control test. This analysis furnishes a fairly comprehensive 
picture of the operating characteristics for this kind of test.


25. Operating Characteristic of the Control Chart for Sample Means. (Preliminary
Report.) Edgar P. King, Carnegie Institute of Technology.


A study is made of the Type I and Type II errors of the control chart for sample means
in the case where process standards are unspecified. Under the null hypothesis, the distribution
of the process is $N(\mu, \sigma^2)$, where $\mu$ and $\sigma$ are unknown constants. Under the alternative,
the process mean is a random variable with a $N(\mu, \theta\sigma^2)$ distribution ($\theta > 0$). The two types
of errors are tabulated for cases ranging from 2 samples of size 2 to 4 samples of size 10.
Bounds on these errors are tabulated for cases ranging from 5 samples of size 2 to 25 samples
of 10. The effect of altering the traditional “3-sigma” limits is investigated and the power
is compared with that of the corresponding analysis of variance test.


26. Joint Sampling Distribution of the Mean and Standard Deviation for
Frequency Functions of the Second Kind. Melvin D. Springer, U. S. Naval
Ordnance, Indianapolis.


The joint sampling distribution of $\bar{x}$ and $s$ is derived for frequency functions of the
second kind, i.e., for frequency functions defined on the interval $(0, \infty)$. The joint
distribution has the integral form


F(z,8)= I ie [se (x2) +++ f(tn—2)-S(§lnF — VR-2x; — Q- |) FAlnF — Te-22; + 2-*)) 


2n*s /Q\"-?) dzn_2 --- dx2 dx, , where the limits of integration of z,_,(r = 2,3, ---,n — 1) 
are given in terms of #, s, and z,_,_;(i = 1,2,+-- ,n —r—1) and depend largely upon 
which of the intervals /;: (Vj/tn — jit, Vij + 1i/in — G+)}#),J =0,1,---,n—2, 
contains s. The limits of integration of z,_,(r = 2,3, --- ,n — 1) also involve Q‘*), m = 1, 
2,---,n—1,k=0,1,--- ,n— 2, where QS = [—m(m + 2)Zkzi — WmTt =e 1 4.7541 + 
2nmzL*kx; — mn(n — m — 1)%* + (m + 1)mns*}i. As an example, F(#, s) is evaluated 
when f(z) is a chi-square universe. Throughout this paper n represents sample size. 


27. Statistical Theory of Droughts. E. J. Gumbel, Consultant, Stanford
University.


The droughts, $x$, the annual minima of discharges, are analyzed by the asymptotic theory
of smallest values of a positive statistical variate, and the extremal probability paper used
up to now for the floods is used for the logarithms of the droughts. If the observed cumulative
frequencies are scattered about a straight line and, in particular, not bent downward
toward the end, the limiting value of the droughts may be assumed to be zero. In this case
the scale parameter $1/\alpha$ and the location parameter $u$ are estimated by the methods used for
the floods. If the observed points approach a curve which is bent downward toward the end,
the lower limit $\epsilon$ exceeds zero, and the asymptotic probability function contains three
parameters. They may be estimated by the method of moments, which leads to Gamma
functions depending only upon the scale parameter $1/\alpha$. The three parameters $1/\alpha$, $u$, $\epsilon$
are then obtained with the help of a table calculated by Gladys Garabedian. A statistical
criterion is given which decides whether the lower limit may be assumed to be zero or not. 
The droughts of 13 rivers analyzed by this procedure show a very good fit between theory 
and observations. The theoretical curves can be used to estimate the most severe drought 
to be expected within a given number of years, provided that the basic conditions will pre- 
vail. This procedure may be important for solving problems arising for storage and irriga- 
tion. 


28. Some Tests Based on the First r Ordered Observations Drawn from an 
Exponential Distribution. Benjamin Epstein and Milton Sobel, Wayne
University.


In this paper we study statistical problems which arise when the observations become
available in an ordered manner. There exist many practical test situations, e.g., life testing,
fatigue testing, and other kinds of destructive test situations where the data occur in order
of magnitude (i.e., the weakest item fails first, the second weakest item fails next, etc.).
It seems that in such cases we can, if we choose, discontinue experimentation after the first
$r$ ($r < n$, the number of items tested) failures in a life test have occurred. Two principal
advantages stem from the fact that the observations occur in an ordered manner. These
are that we may be able to reach a decision in a shorter average time or with fewer observations
on the average or both than if we were to utilize a procedure which has the same risks
of making wrong decisions but which involves taking all $n$ out of $n$ observations (and thus
in effect disregards the basic fact that information is becoming available in an ordered
manner). The problem is explored in some detail for the special case where the life $X$ is
a random variable whose probability law is given by the pdf $f(x; \theta) = e^{-x/\theta}/\theta$, $x > 0$, $\theta > 0$.
A number of procedures meeting either one or both of the desirable objectives mentioned
above are given in connection with various simple and composite tests on the parameter $\theta$.
(Research supported by the Office of Naval Research under Contract No. Nonr-451(00).)


29. Some Theorems Relevant to Life Testing. Milton Sobel and Benjamin
Epstein, Wayne University.

A set $S$ of $N$ independent exponential random variables $X_i$ ($i = 1, 2, \cdots, N$) is considered,
the density of $X_i$ being $f(x_i) = e^{-(x_i - a_i)/\theta}/\theta$, $x_i \ge a_i$, $\theta > 0$, where $\theta$ is the common
unknown parameter to be estimated. One of the cases considered is $a_i = a$ ($i = 1,
2, \cdots, N$), where $a$ is known and positive. All possible ways are considered of breaking up
the set $S$ into subsets. A total of $R$ observations are taken subject to the condition that
within each subset the observations are ordered. For each of these ways the distribution of
the maximum likelihood estimate $\hat{\theta}$ of $\theta$ is the same, namely a distribution of Type III.
Hence they are all equivalent relative to any properties depending only on the distribution
of $\hat{\theta}$, e.g. the variance of $\hat{\theta}$. A replacement procedure is also considered in which the
experimenter can only work with a maximum of one set of $n$ ($0 < n \le N - R + 1$) random
variables. After each observation he takes a new random variable to replace the item that
failed. If $R$ observations are taken the distribution of $\hat{\theta}$ is again the same as above. Some
results on the average time required are also obtained. (Research supported by the Office
of Naval Research under Contract No. Nonr-451(00).)
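
For the simplest arrangement, one set of $n$ items observed until the first $r$ failures with a common known location $a$, the maximum likelihood estimate has a closed form. The Python sketch below uses the standard result $\hat{\theta} = [\sum_{i \le r} (x_{(i)} - a) + (n - r)(x_{(r)} - a)]/r$; this formula is supplied here as an assumption for illustration and is not quoted from the abstract, and the function name is hypothetical.

```python
def censored_exponential_mle(first_failures, n, a=0.0):
    """MLE of theta from the r smallest of n exponential lifetimes.

    first_failures holds the r observed failure times; the remaining
    n - r items are known only to have survived past the last failure.
    """
    ordered = sorted(first_failures)
    r = len(ordered)
    if r == 0 or r > n:
        raise ValueError("need between 1 and n observed failures")
    # Total time on test, measured from the known location parameter a.
    time_on_test = sum(x - a for x in ordered) + (n - r) * (ordered[-1] - a)
    return time_on_test / r


print(censored_exponential_mle([1.0, 2.0, 3.0], n=5))
```

The numerator is the total time on test accumulated by all $n$ items up to the $r$-th failure, which is why stopping early still uses the information carried by the survivors.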


30. A Method of Reducing the Time Required to Complete Certain Fatigue
Tests. Leonard G. Johnson, General Motors Corporation, Detroit.


If it is assumed that the form of a specimen’s life distribution is known, and that there 
is but one unknown parameter, the author shows that the distribution of the maximum 
likelihood estimate of the parameter based on the r first failures out of a sample of n is 
independent of n. As a result, it can be concluded that the testing time required to fail the 
first r specimens, r being fixed, can be made as small as desired simply by making n suffi- 
ciently large. 


31. On the Multivariate Poisson Distribution. Henry Teicher, Purdue
University.


The joint distribution of correlated Poisson variables $X_1, X_2, \cdots, X_p$ may be derived
from a multinomial population and involves a $(2^p - p - 1)$-fold summation as well as
$2^p - 1$ parameters. Its cf is $\phi(t_1, \cdots, t_p) = C \exp\{\sum_{i=1}^{p} a_i z_i + \sum_{i<j} a_{ij} z_i z_j + \cdots + a_{12 \cdots p} z_1 z_2 \cdots z_p\}$,
where $z_j = e^{it_j}$, the $a_i$, $a_{ij}$, $a_{ijk}$ etc. are nonnegative parameters and $C$ is such that
$\phi(0, \cdots, 0) = 1$. For $p = 2$, one has the known result
$\Pr\{X = x, Y = y\} = \sum_{k=0}^{w} [(\mu_1 - \mu)^{x-k} (\mu_2 - \mu)^{y-k} \mu^k / ((x - k)! \, (y - k)! \, k!)] \, e^{-(\mu_1 + \mu_2 - \mu)}$,
where $w = \min(x, y)$, $\mu_1 = E(X)$, $\mu_2 = E(Y)$, $\mu = \mathrm{Cov}(X, Y)$. The cases $p = 2$ and $p = 3$
are considered in detail and the preceding distribution is generated from three simple
postulates. The limiting distribution and some properties are discussed.
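
The bivariate formula for $p = 2$ is straightforward to evaluate. The Python sketch below computes the probability and checks numerically that the probabilities sum to one and that the covariance equals $\mu$; the function name and the parameter values in the example are illustrative assumptions.

```python
import math


def bivariate_poisson_pmf(x, y, mu1, mu2, mu):
    """Pr{X = x, Y = y} with E(X) = mu1, E(Y) = mu2, Cov(X, Y) = mu."""
    total = 0.0
    for k in range(min(x, y) + 1):
        total += ((mu1 - mu) ** (x - k) * (mu2 - mu) ** (y - k) * mu ** k
                  / (math.factorial(x - k) * math.factorial(y - k)
                     * math.factorial(k)))
    return math.exp(-(mu1 + mu2 - mu)) * total


# Numerical check on a truncated grid (the neglected tail mass is negligible here).
grid = range(40)
mass = sum(bivariate_poisson_pmf(x, y, 1.0, 2.0, 0.5) for x in grid for y in grid)
cross = sum(x * y * bivariate_poisson_pmf(x, y, 1.0, 2.0, 0.5)
            for x in grid for y in grid)
print(mass, cross - 1.0 * 2.0)
```

The formula corresponds to writing $X = U + W$ and $Y = V + W$ with independent Poisson variables of means $\mu_1 - \mu$, $\mu_2 - \mu$ and $\mu$, which is one way to see why $\mu$ must not exceed either mean.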


32. Formulas for Approximating the Hypergeometric and Binomial by the 
Poisson Distribution. Irving W. Burr, Purdue University.


Let a sample of $n$ be drawn from a lot of $m$ objects of which $d$ are of one sort; let $d/m$
be $p$ and $np$ be $\lambda$. Then if the hypergeometric, binomial and Poisson probabilities of exactly
$x$ in the sample are respectively $h(x; n, m, d)$, $b(x; n, p)$, and $p(x; \lambda)$ we have approximately
$h(x; n, m, d) = b(x; n, p)[1 + (x - (x - \lambda)^2)/(2mp(1 - p))]$; $b(x; n, p) = p(x; \lambda)[1
+ (x - (x - \lambda)^2)/(2n)]$. The former is to within terms of the order $1/(m^2 p^2)$ while the
latter is to within terms of the order of $1/n^2$. Since the second terms in the brackets are
approximate relative errors, they may be added in going from $p(x; \lambda)$ to $h(x; n, m, d)$. Using
the formulas as a correction to tabulated values of the Poisson distribution, we get excellent
approximations to hypergeometric probabilities.
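
The effect of the correction can be seen on a small example. The Python sketch below compares the exact hypergeometric probability with the plain Poisson value and with the Poisson value multiplied by one plus the sum of the two bracketed terms, as the abstract suggests. The lot and sample sizes are arbitrary illustrative choices and the function names are hypothetical.

```python
import math


def hypergeometric(x, n, m, d):
    """Exact probability of x marked objects in a sample of n from a lot of m."""
    return math.comb(d, x) * math.comb(m - d, n - x) / math.comb(m, n)


def poisson(x, lam):
    return math.exp(-lam) * lam ** x / math.factorial(x)


def corrected_poisson(x, n, m, d):
    """Poisson value with both approximate relative errors added."""
    p = d / m
    lam = n * p
    term = x - (x - lam) ** 2
    factor = 1.0 + term / (2.0 * n) + term / (2.0 * m * p * (1.0 - p))
    return poisson(x, lam) * factor


n, m, d = 20, 1000, 50  # illustrative values: p = 0.05, lambda = 1
for x in range(4):
    print(x, hypergeometric(x, n, m, d), poisson(x, n * d / m),
          corrected_poisson(x, n, m, d))
```

In this example the corrected values should lie much closer to the exact hypergeometric probabilities than the uncorrected Poisson values, which is the use of the formulas described in the abstract.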


33. Distributions of Ranges from an Arbitrary Discrete Population. Irving W.
Burr, Purdue University.


The exact sampling distribution for the range, R, for small samples from any discrete 
population (with finite range) may be obtained from formulas involving combinations of 
sums of nth powers of certain sums of consecutive probabilities. The calculation is not at 
all prohibitive for samples of five or less if probabilities are taken to the nearest .01 or .005. 
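
One formula of the kind described can be written down by inclusion and exclusion: the probability that all $n$ observations fall in a run of consecutive values, less the cases that miss the lowest or the highest value of the run. The Python sketch below is an illustration of that idea and is not claimed to be the author's exact formula; it assumes the population values are equally spaced, so that the range is counted in steps, and the function name is hypothetical.

```python
def range_distribution(probs, n):
    """Return [P(R = 0), P(R = 1), ...] for samples of n.

    probs[i] is the probability of the i-th of K equally spaced values.
    """
    k = len(probs)
    prefix = [0.0]
    for p in probs:
        prefix.append(prefix[-1] + p)

    def block(lo, hi):
        # Sum of consecutive probabilities probs[lo..hi]; empty runs give 0.
        return prefix[hi + 1] - prefix[lo] if hi >= lo else 0.0

    dist = []
    for r in range(k):
        total = 0.0
        for lo in range(k - r):
            hi = lo + r
            # All n values in [lo, hi], with both end values actually attained.
            total += (block(lo, hi) ** n - block(lo + 1, hi) ** n
                      - block(lo, hi - 1) ** n + block(lo + 1, hi - 1) ** n)
        dist.append(total)
    return dist


print(range_distribution([1 / 6] * 6, 2))
```

Each term is an nth power of a sum of consecutive probabilities, so the work grows only with the number of distinct values, not with the number of possible samples.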


34. Sufficient Statistics and Selection Depending on the Parameter. D. A. S.
Fraser, University of Toronto.


For a class of density functions with respect to a fixed measure, “functional sufficiency”
or “f-sufficiency” is defined by the density factorization usually associated with sufficiency.
Conditions are immediately available under which sufficiency and f-sufficiency are equivalent.
A minimal f-sufficient statistic is defined and proved to be essentially unique; its
construction is given. The minimal f-sufficient statistic is shown to be equivalent to the
combination of a “statistic of selection” and the minimal f-sufficient statistic for a class
of densities for which the region of positive density is fixed. Subject to mild continuity 
conditions, sufficient statistics in this latter case have been treated by B. O. Koopman. If 
the parameter is a parameter of selection from a fixed distribution, then the statistic of 
selection is the minimal f-sufficient statistic. If in addition the regions of positive density 
are monotone and are indexed monotonely by a real parameter, then the statistic of selec- 
tion is sufficient according to the Halmos and Savage definition. 


35. On a Problem Suggested by Blackwell. (Preliminary Report.) Charles
Stein, University of Chicago.


If $M$, $M'$ are probability measures concentrated on a finite set of points in n-space, we
say that $M > M'$ if for every convex function $\phi$, $\int \phi \, dM \ge \int \phi \, dM'$, and we say that $M \succ M'$
if there exists a stochastic matrix $(p_{ij})$ (one with $p_{ij} \ge 0$, $\sum_j p_{ij} = 1$) such that
$\lambda_j' = \sum_i \lambda_i p_{ij}$, $\lambda_j' y_j' = \sum_i \lambda_i y_i p_{ij}$, where $M$, $M'$ are concentrated on $y_i$, $y_j'$
respectively and $M(y_i) = \lambda_i$, $M'(y_j') = \lambda_j'$. It is obvious that $M \succ M'$ implies $M > M'$ and
here the converse is proved.
The following lemma is used. If $\phi$ is a convex function defined on the convex set $[y_1, \cdots,
y_k]$ generated by the points $y_1, \cdots, y_k$ in a real vector space, then there exists a convex
function $\phi_1$ on $[y_1, \cdots, y_k]$ such that $\phi_1 \ge \phi$, $\phi_1(y_i) = \phi(y_i)$ for $i = 1, \cdots, k$, and there
exists a decomposition of $[y_1, \cdots, y_k]$ into simplexes with vertices among $y_1, \cdots, y_k$ such
that $\phi_1$ is linear on each of these simplexes. It is proved that the set of all $M''$ such that
$M > M'' > M'$ ordered by $>$ has a maximal point. It is proved that any maximal $M''$ must
be concentrated on the set of points to which $M$ assigns positive probability. The proof is
completed by the additivity of $>$. The significance of this result for statistical decision
theory is explained by Blackwell (“Comparison of experiments,” Proceedings of the Second
Berkeley Symposium on Mathematical Statistics and Probability, University of California
Press, 1951, pp. 93-102).
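
The easy direction, that the stochastic-matrix relation implies the convex-function inequality, can be illustrated numerically. The Python sketch below builds $M'$ from $M$ and a row-stochastic matrix and compares the two integrals for a few convex functions. It works on the real line for simplicity (the abstract is set in n-space), and all names and numbers are illustrative assumptions.

```python
def contract(points, weights, matrix):
    """Return (points', weights') of M' obtained from M and a stochastic matrix.

    weights'[j] = sum_i weights[i] * p[i][j], and points'[j] is the
    correspondingly weighted average of the points of M.
    """
    columns = range(len(matrix[0]))
    new_weights = [sum(w * row[j] for w, row in zip(weights, matrix))
                   for j in columns]
    new_points = [sum(w * row[j] * y for w, row, y in zip(weights, matrix, points))
                  / new_weights[j] for j in columns]
    return new_points, new_weights


def integral(phi, points, weights):
    return sum(w * phi(y) for y, w in zip(points, weights))


points, weights = [0.0, 1.0, 2.0], [1 / 3, 1 / 3, 1 / 3]
matrix = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]   # each row sums to 1
new_points, new_weights = contract(points, weights, matrix)

for phi in (lambda y: y * y, lambda y: abs(y - 1.0), lambda y: max(y, 0.5)):
    print(integral(phi, points, weights), integral(phi, new_points, new_weights))
```

In each printed pair the first number should be at least as large as the second, which is Jensen's inequality applied column by column; the converse, which is the content of the paper, is not something a numerical example can establish.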


36. On a Class of Infinitely Divisible Distributions. Gopinath Kallianpur,
University of California, Berkeley, and Herbert Robbins, University of
North Carolina.


Let $\Phi(x)$ be a non-lattice df having a finite moment of order $2 + \delta$ ($\delta > 0$) and let $\phi(t)$
be its characteristic function (cf). Set $\phi'(0) = ia$ and $\phi''(0) = -b^2$. A study is made of
logarithms of cf's (lcf) which are given by one of the representations,


(A) $g(t) = i\gamma t + \int_{-\infty}^{\infty} [\phi(tu) - 1 - iatu/(1 + u^2)][(1 + u^2)/u^2] \, dG(u)$,


(B) $g(t) = i\gamma t + \int_{-\infty}^{\infty} [\phi(tu) - 1 - iatu][1/u^2] \, dG(u)$,


where $\gamma$ is a real constant and $G(u)$ is a nondecreasing function of B.V. with $G(-\infty) = 0$.
The following results are proved. (1) The limit of a sequence of lcf's of type (A) is also the
lcf of type (A). (2) The formula (A) uniquely determines $\gamma$ and $G(u)$. (3) A necessary and
sufficient condition for a sequence $g_n(t)$, ($n = 1, 2, \cdots$) of lcf's of type (A) to converge to
the lcf $g(t)$ is that as $n \to \infty$, $\gamma_n \to \gamma$ and $G_n(u) \to G(u)$ at points of continuity of the latter.
Analogous results for lcf's of type (B) are stated. Lcf's of types (A) and (B) occur naturally
in connection with the investigation of limiting distributions (as $\lambda \to \infty$) of r.v.'s of the
form (*) $\int_{\alpha}^{\beta} R_\lambda(u) \, dX(u) - A_\lambda$, where (i) $\alpha$, $\beta$ are finite, (ii) $R_\lambda(u)$ is for $\lambda > 0$ a
continuous, monotone function in $(\alpha, \beta)$, (iii) $A_\lambda$ is a constant depending on $\lambda$ only, and (iv) $X(u)$
is a generalized Poisson process, with $\lambda$ as the Poisson parameter. Using the above results,
it is proved that the class of limit laws of (*) is the class of distributions whose lcf's have
the representation (A).


37. Almost Sure Estimability of Linear Structures in n Dimensions. T. A. Jeeves,
University of California, Berkeley.


Let $B$ denote an $n \times n$ matrix of rank $r$ of sure numbers. Let $X$ and $U$ be independent
unobservable n-dimensional random row vectors, and $Y$ be an observable n-dimensional
random row vector, such that $Y = XB + U$. Assume that $U$ has a multinormal distribution
and that $B$ is identifiable. The problem considered is to use a sequence of $N$ independent
observations of $Y$ to construct an almost sure estimate $S_N$ of $S$, the space spanned by the
row vectors of $B$. Let $\Delta_N = \max_{a_N} \min_{a} |a_N - a|$ for all unit vectors $a_N$ in $S_N$, and all
unit vectors $a$ in $S$. Then $S_N$ is said to be an almost sure estimate of $S$ if the probability
of $\Delta_N$ converging to zero is unity. The estimate $S_N$ defined below generalizes the procedure
of Neyman for the case of two dimensions. Let $Z_N$ be the $N \times n$ matrix of $N$ observations
on $Y$. A function $G_N(Z_N, K)$ of $Z_N$ and the row vector $K$ is obtained which converges with
probability one to $G(K)$, a function which is zero if and only if $K$ is perpendicular to $S$.
Let $K_{N1}$ be the vector which minimizes $G_N(Z_N, K)$, and let $K_{Ni}$ (for $i = 2, 3, \cdots, n - r$)
be the vector which minimizes $G_N(Z_N, K)$, subject to the restraint that $K_{Ni}$ is perpendicular
to $K_{Nj}$ for $j = 1, 2, \cdots, i - 1$. $S_N$ is the space orthogonal to the vectors $K_{Ni}$ for $i = 1,
2, \cdots, n - r$.


38. Completeness of the Class of Admissible Decision Procedures. Herman
Rubin, Stanford University.


Let a space $\Omega$ of distributions and a space $\Phi$ of actions be given, and for each action $\phi$
and distribution $\theta$ let a risk $\rho(\phi, \theta)$ be given. For every sequence $\phi_i$ such that for all $\theta$,
$\rho(\phi_i, \theta) \le \rho(\phi_{i-1}, \theta)$ for all $i$, let there be a $\psi$ such that $\rho(\phi_i, \theta) \ge \rho(\psi, \theta)$ for all $i$.
Then if $\rho(\phi, \theta)$ is continuous in $\theta$ for each $\phi$ in some topology in which $\Omega$ has a countable
dense set, or if $\rho(\phi, \theta)$ is lower semicontinuous in some topology in which every subset of
$\Omega$ has a countable dense subset, then the admissible elements of $\Phi$ form a complete class. An
example of the applicability of this theorem is as follows. Let a sequential decision problem
be given with a finite set of terminal actions. Then if the density of the first $n$ observations
is continuous for each $n$ in some topology in which every subset of $\Omega$ has a countable dense
subset (in particular, if each observation can have only countably many values), the
admissible procedures form a complete class.


39. Moment-Problem Solutions with Continuous Derivatives. (Preliminary
Report.) Lionel Weiss, University of Virginia.


Given a finite sequence of moments $\mu_0, \mu_1, \cdots, \mu_n$ so that there is at least one cumulative
distribution function on a given interval $[a, b]$ with these moments, conditions are known
under which one can find an infinite number of cumulative distribution functions over the
given interval with these moments. In these cases, further conditions are given so that at
least one of the functions shall have a derivative whose square is integrable, and that function
(say $F(x)$) is sought whose derivative $f(x)$ has the property that $\int_a^b f^2(x) \, dx$ is as small
as possible. The solution is essentially unique, and $f(x)$ is continuous and is equal to a
polynomial of at most the nth degree wherever it is greater than zero. The results can be
extended in various directions.


40. Significance Consistency of the Basic Neyman-Pearson Test. L. M. Court, 
American Power Jet Company, New York. 


In a recent note (‘‘A property of some tests of composite hypotheses,’’ Annals of Math. 
Stat., Vol. 22 (1951), pp. 475-476) C. Stein pointed out that for most of the common tests, 
a result that is significant at the 1% level is significant at the 5% level. Having said this, 
he uses the fundamental Neyman-Pearson lemma to construct an example for which this 
is not so. (A similar example for Neyman-Pearson Type A regions is given by Chernoff, 
‘A property of some Type A regions,’’ Annals of Math. Stat., Vol. 22 (1951), pp. 472-474.) 
In Stein’s example, a composite hypothesis (2 elements) is tested against a simple alterna- 
tive, all three distributions being entirely discrete. The writer shows that it is essentially 
the discreteness that is responsible for this undesirable behaviour. He goes on to show that 
when the hypothesis and alternative are both simple and absolutely continuous distribu- 
tions (i.e., density functions exist everywhere) and the Neyman-Pearson lemma is used to 
determine critical regions, this phenomenon cannot arise. If the hypothesis is composite 
but a Bayesian approach (Wald) is possible, i.e., there is a least favorable distribution for
the parameter over the range specified by the hypothesis in the Lehmann-Stein sense, this 
conclusion can be extended to it. 


41. On Sets of Parameter Points where It Is Possible to Achieve Superefficiency
of Estimates. Lucien L. LeCam, University of California, Berkeley.


Let $X$ be a random variable with probability density $f(x \mid \theta)$ depending on a parameter
$\theta \in \Omega$, $\Omega$ being a measurable set of points on the real line. Let $X^{(n)} = (X_1, X_2, \cdots, X_n)$
be a sample of $n$ independent observations on $X$. A sequence $\{T_n(X^{(n)})\}$ of measurable
functions is called a consistent asymptotically normal (c.a.n.) estimate of $\theta$, with asymptotic
variance $\{\sigma_n^2(\theta)\}$, if for every $\theta \in \Omega$, and for every $t$,
$\lim_{n \to \infty} P\{(T_n[X^{(n)}] - \theta)/\sigma_n(\theta) < t \mid \theta\} = (1/\sqrt{2\pi}) \int_{-\infty}^{t} e^{-x^2/2} \, dx$.
Assume Cramér's regularity conditions which imply consistency and
asymptotic normality of the maximum likelihood estimate of $\theta$ (Mathematical Methods of
Statistics, Princeton University Press, 1946, p. 500). Let $\{\alpha_n^2(\theta)\}$ be the asymptotic variance
of the M.L. estimate. As $n \to \infty$, let $\beta(\theta) = \limsup [\sigma_n(\theta)/\alpha_n(\theta)]$ and
$\gamma(\theta) = \lim [\sigma_n(\theta)/\alpha_n(\theta)]$ if this limit exists. An estimate $\{T_n[X^{(n)}]\}$ is called superefficient
on $S \subset \Omega$ if it is c.a.n. and if $\beta(\theta) \le 1$ for $\theta \in \Omega$ and $\beta(\theta) < 1$ for $\theta \in S$. This set $S$ is
called the set of superefficiency. J. L. Hodges produced examples of superefficient estimates.
His method of construction will be denoted by (H). THEOREM 1. Whatever $\epsilon$, $0 \le \epsilon < 1$,
and whatever the closed and reducible set $S_0 \subset \Omega$, it is possible to construct superefficient
estimates of $\theta$ with $\beta(\theta) \le \epsilon$ on $S_0$. The method of construction is (H). THEOREM 2. The set $S$
of superefficiency has Lebesgue measure zero. THEOREM 3. If $\gamma(\theta)$ exists for all $\theta \in S$ then,
whatever be $\epsilon$, $0 \le \epsilon < 1$, the subset of $S$ where $\gamma(\theta) \le \epsilon$ is closed and nondense.
THEOREM 4. Whatever the denumerable set $S \subset \Omega$, it is possible to construct $\{T_n[X^{(n)}]\}$
c.a.n. on $\Omega - S$, with asymptotic variance $\{\alpha_n^2(\theta)\}$ and such that for every $\theta \in S$, the limit
law of $[T_n - \theta]/\alpha_n(\theta)$, as $n \to \infty$, is more concentrated than the corresponding law of the
M.L. estimates.


42. Relative Precision of Least Squares and Maximum Likelihood Estimates 
of Regression Coefficients. Joseph Berkson, Mayo Clinic.


Three “estimators” of the parameters $\alpha$ and $\beta$ of the logistic function $P_i = 1/(1 + e^{-(\alpha + \beta x_i)})$
as used in bioassay were compared for three equally-spaced values of the dose $x_i$, 10 at each
dose: (1) maximum likelihood, (2) minimum (Pearson classic) $\chi^2$, (3) minimum logit $\chi^2$, the
first two requiring iterative procedures for evaluation, the last obtainable directly. With
central dose at the L.D. 50, the three estimates are unbiased; the variance is smallest for
the minimum logit $\chi^2$, next larger for the minimum $\chi^2$, and largest for the maximum likelihood
estimate. For dosage arrangements not symmetrical around the L.D. 50, the three
estimates are biased, the maximum likelihood estimate positively, the $\chi^2$ estimates
negatively; the mean square error is smallest for the minimum logit $\chi^2$, next larger for the
minimum $\chi^2$, and largest for the maximum likelihood estimate. For all dose arrangements,
the mean square error of the maximum likelihood estimate is larger than $1/I$, those of the
$\chi^2$ estimates are less than $1/I$, in accordance with the Cramér inequality for the mean square
error. Each of the estimators is sufficient.




NEWS AND NOTICES 


Readers are invited to submit to the Secretary of the Institute news items of interest. 
Personal Items 
Dr. Kurt W. Back, formerly at Washington, D. C., is now acting as Social
Research Analyst on the Air Force Contract for the Bureau of Applied Social
Research, a Columbia University project.

Professor Z. W. Birnbaum is on leave of absence from the University of Wash-
ington and is Visiting Professor of Statistics at Stanford University for the aca-
demic year 1951-52.

Dr. Edward P. Coleman, formerly at Columbia University, is a Mathematical 
Statistician for Hughes Aircraft Company, Culver City, California, and is teach- 
ing engineering statistics at the University of California at Los Angeles. 

Dr. William S. Connor, Jr., who has been doing graduate work at the Univer-
sity of North Carolina at Chapel Hill, has joined the Statistical Engineering Labo-
ratory of the National Bureau of Standards, Washington, D. C.

Dr. John T. Dailey has left the Human Resources Research Center, Lackland 
Air Force Base, San Antonio, Texas, to replace Dr. Brundage as Chief of Classifi- 
cation and Survey Research, Bureau of Naval Personnel, Washington 25, D. C. 

Mr. John F. Hofmann, formerly with the Statistical Laboratory, Iowa State 
College, has taken a position as Mathematical Statistician at Dugway Proving 
Ground, Tooele, Utah. 

Mr. Elvin A. Hoy, formerly Head (Section A), Applied Sciences and Mathe- 
matics, Navy Training Publications Center, U.S. Navy, will serve as Analytical 
Statistician in the Industrial Factors Section, Planning Research Branch, Pro- 
gram Review and Analysis Division, Office of the Comptroller of the Army. 

Dr. Eric R. Immel has accepted an instructorship in the Department of Mathe- 
matics, University of Wisconsin. 


Dr. H. P. Mulholland has accepted an appointment as Lecturer in Pure
Mathematics, University of Birmingham, England.

Dr. M. G. Neurdenburg has left the statistical division of the Municipal 
Medical and Public Health Department of Amsterdam and has been appointed 
Medical Officer of Public Health in the Netherlands State Public Health De- 
partment. His duties will be to start the Cancer Registration in the Netherlands, 
and to act as Head of the newly created Division of Public Health Statistics in 
the same Department. 

Mr. Don C. Price has accepted a position in the Institute for Cooperative Re- 
search, the Johns Hopkins University, Baltimore, Maryland. 

Dr. P. Ratoosh has resigned his position as Instructor at the University of 
Wisconsin and has been appointed Assistant Professor in the Department of 
Psychology at Ohio State University. 

Mr. David Rubinstein, who has been a Research and Teaching Assistant at 
the University of California, has been appointed as Supervisory Survey Statisti- 
cian in the Department of Biometrics at the USAF School of Aviation Medicine, 
Randolph Field, Texas. 

Miss Marion Sandomire, formerly employed in the Bureau of the Census, 
Washington, D. C., has accepted a position as Statistician, Production Division, 
New York Operations Office of the Atomic Energy Commission. 

Dr. S. S. Shrikhande, who is on leave of absence from the College of Science, 
Nagpur, India, has been appointed Visiting Assistant Professor of Mathematical 
Statistics at the University of Kansas, Lawrence, for the academic years 1951 
to 1953.

Mr. Waldo A. Vezeau has been promoted to Associate Professor of Mathe-
matics at St. Louis University.

Mr. Harry Weingarten has accepted a position with the Quality Control Divi- 
sion, Bureau of Ordnance, Navy Department, Washington, D. C. 

Dr. R. K. Zeigler, formerly Associate Professor at Bradley University, Peoria, 
Illinois, has accepted a position as Statistician with the Atomic Energy Com- 
mission, Oak Ridge, Tennessee. 

Dr. P. V. Sukhatme, Chief of the Statistics Branch, Food and Agriculture 
Organization of the United Nations, will be Visiting Professor of Statistics at 
Iowa State College during the Spring Quarter, beginning March 27, 1952; he 
will give lectures in advanced survey sampling. 




A copy of the first issue of Veröffentlichungen des Deutschen Aktuarvereins
has been received. The publication includes actuarial, probability and statistical 
papers related to insurance. 


In January 1950 there was created in the Superior Council of Scientific Re- 
search of Spain the Department of Statistics (Calle Serrano, 123, Madrid), whose 
Director is Professor Sixto Rios, Professor of Mathematical Statistics at the 
University of Madrid. The purpose of this department is scientific research in 
the various branches of statistics and its applications. It publishes every four 
months the review Trabajos de Estadística, and in this department seminars have
been conducted by Professors Fréchet, Cramér, Wold, and Mahalanobis. Pro- 
fessor Herman Wold, Director of the Institute of Statistics at the University of 
Uppsala, conducted during November 1951 a series of seminars on statistical 
inference and stochastic processes. 


Biostatistics Conference, June 16—July 23, 1952, Iowa State College, Ames, Iowa 


A biostatistics conference has been scheduled for June 16 to July 23, 1952, at 
Iowa State College, Ames, Iowa. It is sponsored by faculty members working in 
agriculture, biology, and statistics at Iowa State College and by the Biometric 
Society (ENAR). The plan of the program is that each morning a biologist will 
present a problem, outline the objectives, describe techniques suitable for the 
experiment and analysis. A paired statistician will discuss suitable experimental 
designs and statistical and mathematical methods for attacking the problem. 
These speakers will preside at a general discussion period of the same topic the 
same afternoon. 

The program is tentatively arranged in five somewhat separate weekly units 
as follows: 

First week: Development of Quantitative Biology 

Second week: Specification of Populations and Their Processes 

Third week: The Estimation of Populations 

Fourth week: Individual Growth 

Fifth week: Biomathematical Mechanisms Within the Individual and Species.

Invited speakers include: Edgar Anderson, Geoffrey Beall, Joseph Berkson,
Chester I. Bliss, P. W. Bridgman, Samuel Brody, C. West Churchman, William
G. Cochran, C. C. Cockerham, Jerome Cornfield, Charles W. Cotterman, James
F. Crow, S. L. Crump, D. B. DeLury, Gordon E. Dickerson, Harold F. Dorn,
A. M. Dutton, W. T. Federer, Robert P. Gage, C. B. Godbey, A. A. Hasel, P. G.
Homeyer, John W. Hopkins, H. Hotelling, S. Isaacson, O. Kempthorne, H. H.
Kramer, Warren Leonard, Howard Levene, H. L. Lucas, S. E. Luria, J. L. Lush,
William G. Madow, Walter M. Meyer, Lloyd Miller, H. J. Muller, H. C. Murphy,
Jerzy Neyman, H. W. Norton, Thomas Park, N. Rashevsky, A. S. Rosenblueth,
L. W. Scattergood, Franz Schrader, John P. Scott, G. W. Snedecor, G. W.
Stewart, P. C. Tang, D. J. Thompson, John W. Tukey, A. S. Wiener, Norbert
Wiener, Frank Wilcoxon, Edwin B. Wilson, Sewall Wright, M. R. Zelle, J. A.
Zoellner.

Rooms will be available in the college dormitories at the usual rates. For more 
detailed information write: T. A. Bancroft, Director, Statistical Laboratory, 
Iowa State College, Ames, Iowa. 


Summer Sessions in Berkeley, California 


This year’s summer program at the Statistical Laboratory of the University 
of California, Berkeley, California, consists of two sessions, June 23—August 2 
and August 4-September 13. The program includes 2 of the usual undergraduate 
courses, one in each session, as well as one new course in each session. In the first 
session the new course being offered is called: “Statistical methods of searching
for causal relationships.” The course is designed to acquaint the students with
statistical methods of approaching practical problems with particular reference 
to correlation and causality and to the pitfalls which studies of this kind fre- 
quently involve. 

The faculty of the summer sessions will include Dr. Grace E. Bates, Assistant 
Professor, Department of Mathematics, Mt. Holyoke College, South Hadley, 
Mass.; Professor J. Neyman, Dr. E. Fix, Dr. G. Kallianpur, Mr. L. LeCam, of 
the Statistical Laboratory, University of California. Professor J. Neyman will 
be available for consultations on work leading to higher degrees. 




New Members 
The following persons have been elected to membership in the Institute
(August 23, 1951 to December 1, 1951)

Barberi, Benedetto, Ph.D., Director General of the Central Institute of Statistics, Via
Balbo 16, Rome, Italy.

Bates, Grace E., Ph.D. (Univ. of Ill.), Assistant Professor of Mathematics, Mount Hol-
yoke College, South Hadley, Massachusetts.

Bignardi, Francesco, Teacher of Social Statistics, University of Palermo, Viale Albertazzi
16 IIT, Bologna, Italy. 

Brenna, Svein, M.A. (Univ. of Oslo), Consultant, Central Bureau of Statistics, Ole Moe’s
vei 35, Nordstrandshogda, Norway.

Breny, Henri, Ph.D. (Univ. of Liège), Lecturer at the Ecole Superieure des Textiles,
Verviers, 528, Rue Haveigne, Fraipont, pce le Liège, Belgium.

Brownlee, Kenneth A., M.A. (Cambridge), Chief, Test Design Branch, Plans and Evalua- 
tion Office, Technical Operations, Dugway Proving Ground, Tooele, Utah. 

Cox, Edwin L., M.S. (Virginia Polytech. Inst.), Mathematical Statistician, Plans and 
Evaluation Office, Technical Operations, Dugway Proving Ground, Tooele, Utah. 

Garner, Norman R., M.S. (North Carolina State Coll.), Analytical Statistician, Apt. 4-C, 
Woodlawn Court Apartments, Aldan, Del. County, Pennsylvania. 

Garrett, John E., B.S. (A&M College of Texas), Process Engineer, Film Research and 
Development Department, Olin Industries, Inc., P.O. Drawer 906, New Haven, 
Connecticut. 

Grings, William W., Ph.D. (Univ. of Iowa), Associate Professor of Psychology, University 
of Southern California, 8020 Agnew Avenue, Los Angeles 45, California. 

Homeyer, Paul G., M.S. (A&M College of Texas), Professor of Statistics, Statistical 
Laboratory, Iowa State College, Ames, Iowa. 

Luders, Johann D., Ph.D. (Univ. of Bonn), Chief, Statistical Methods Section, Regional 
Statistical Office of North Rhine-Westphalia, Statistisches Landesamt, Haroldstrasse 
37, Düsseldorf, Germany.

Malmquist, Sten, Fil. Dr. (Univ. of Uppsala), Docent, Institute of Statistics, University 
of Uppsala, Uppsala, Sweden. 

Moan, Obert B., M.S. (Univ. of Minnesota), Quality Control Analyst, Julius Hyman and 
Company, Denver 1, Colorado. 

Powell, James H., M.A. (Mich. State College), Graduate Assistant, 904-C Maple Lane, 
East Lansing, Michigan. 

Rosenberry, L. Porter, B.S. (Pa. State College), Mathematical Physicist, Mathematical 
and Theoretical Physics Section, Explosives and Physical Sciences Division, Bureau 
of Mines, 4800 Forbes Street, Pittsburgh 13, Pennsylvania. 

Rosenblatt, Murray, Ph.D. (Cornell Univ.), Research Associate and Assistant Professor, 
Committee on Statistics, Eckhart Hall, University of Chicago, Chicago, Illinois. 

Shaw, Donald J., B.A. (Hope College, Holland, Mich.), Quality Control Engineer, E. I. 
duPont de Nemours & Co., Inc., 230 Isle Avenue, Waynesboro, Virginia. 

Singleton, Marcus G., M.S. (Univ. of North Carolina), Member of Engineering Staff, 
North American Aviation, Inc., 1455 West 97th Street, Los Angeles 47, California. 

Stange, Kurt, Ph.D. (Göttingen), Dozent for applied mathematics and mechanics and
acting for Director, Institut für Mathematik und ihre technischen Anwendungen,
Technische Hochschule (17a), Karlsruhe, Germany.

Terry, Milton E., Ph.D. (Univ. of North Carolina), Associate Professor of Statistics, 
Virginia Polytechnic Institute, Blacksburg, Virginia. 

Venkatasubbiah, Gargeswari, B.T. (Mysore Univ.), Consulting Actuary and Professor of 
Statistics and Actuarial Science, B.M. College of Commerce, 894 Amrai Camp, 
Poona 4, India. 

Yamauti, Ziro, Ph.D. (Tokyo Imperial Univ.), Professor, Faculty of Engineering, Uni-
versity of Tokyo, Jogi-cho, 2-chome, No. 1, Suginami-ku, Tokyo, Japan.


REPORT OF THE WASHINGTON MEETING OF THE
INSTITUTE


The forty-ninth meeting of the Institute of Mathematical Statistics was held 
in Washington, D. C., at the National Bureau of Standards on October 26-27, 
1951, in conjunction with the October, 1951, meetings of the American Mathe- 
matical Society and of the Washington Section of the American Society for 
Quality Control. These meetings formed a part of the Semicentennial celebra-
tion of the National Bureau of Standards. Three hundred fifty-one persons
registered, including the following members of the Institute: 


Paul H. Anderson, T. W. Anderson, Ishver S. Bangdiwala, Charles A. Bicking, David
Blackwell, John B. Boddie, R. C. Bose, A. H. Bowker, R. A. Bradley, Irwin J. Bross, R.
S. Burington, Glenn L. Burrows, Joseph M. Cameron, Burton H. Camp, William S. Connor,
Jr., Edward L. Corton, Jr., G. F. Cramer, John H. Curtiss, Joseph F. Daly, George B. 
Dantzig, Donald A. Darling, Besse B. Day, R. DeLancie, W. Edwards Deming, P. Desind, 
Irene L. Doto, David Duncan, A. J. Duncan, David Durand, M. Dwass, Howard Edelson, 
Churchill Eisenhart, Henry Ellner, Jerome Engel, Benjamin Epstein, J. Kampé de Fériet, 
Clarence B. Fine, Kathryn Froelich, Edward Fritz, H. H. Germond, B. C. Getchell, Doro- 
thy Morrow Gilford, Leon Gilford, Samuel W. Greenhouse, Joseph A. Greenwood, Robert 
E. Greenwood, Alan A. Grometstein, E. J. Gumbel, Max Halperin, James F. Hannan, 
Harry H. Harman, Boyd Harshbarger, Herman Hess, Elvin A. Hoy, Charles H. Hubbell, 
George J. Hurwitz, Walter Jennings, Abraham E. Karp, Morton Kupperman, B. M. Kurk- 
jian, Jack Laderman, Otis E. Lancaster, Gilbert Lieberman, Marguerite Lehr, E. Leh-
mann, Julius Lieblein, Jacob E. Lieberman, Eugene Lukacs, Mary D. Lum, Eli S. Marks,
Arthur S. Marthens, Clifford J. Maloney, John Mandel, Nathan Mantel, Frances A. 
Marans, Robert Mirsky, Hugh J. Miser, Sigeiti Moriguti, L. E. Moses, Charles M. Mottley, 
Mary G. Natrella, Monroe L. Norden, Toby Oxtoby, William R. Pabst, Jr., George W. 
Petrie, III, Harry Press, Don C. Price, F. R. DelPriore, Frank Proschan, Carl J. Rees, 
Mina Rees, M. H. Replogle, J. N. Rice, Paul R. Rider, Joan R. Rosenblatt, Ernest Rubin, 
Rose Sachs, Marion Sandomire, I. Richard Savage, Henry Scheffé, S. A. Schmitt, Marvin 
A. Schneiderman, Monroe G. Sirken, Rosedith Sitgreaves, John H. Smith, R. T. Smith, 
A. D. Solem, Herbert Solomon, D. Teichroew, Benjamin J. Tepping, M. E. Terry, Reuben 
Tucker, A. W. Tucker, John Tukey, George W. Tyler, F. M. Wadley, Frank M. Weida, 
Sidney Weiner, Harry Weingarten, Walter Whelan, Frank Wilcoxon, Max Woodbury, W. 
J. Youden. 


Dr. Eugene Lukacs, National Bureau of Standards, presided at the opening 
session on Friday morning, at which the following papers were contributed: 


1. On the Law of Propagation of Error. Preliminary Report. Churchill Eisenhart and I.
Richard Savage, Statistical Engineering Laboratory, National Bureau of Standards.

2. Multivariate Orthogonal Polynomials. Preliminary Report. D. B. Duncan and L. W.
Cooper, Virginia Polytechnic Institute.

3. An Analysis of Variance for Paired Comparisons. Henry Scheffé, Columbia University.

4. Statistical Theory of Fatigue Failures. E. J. Gumbel, Consultant, Stanford Univer-
sity, and A. M. Freudenthal, Columbia University.

5. Analysis of Chain Block Designs. Preliminary Report. W. S. Connor, Jr., and W. J.
Youden, Statistical Engineering Laboratory, National Bureau of Standards.


This was followed by a symposium on The Planning of Experiments at which 
Miss Besse B. Day, U. S. Naval Engineering Experimental Station, presided.
The speakers and their subjects were: 


1. The Control and Measurement of Experimental Error. W. J. Youden, National Bureau 
of Standards. 
2. Recent Developments in Incomplete Block Designs. R. C. Bose, University of North
Carolina.


Dr. Benjamin J. Tepping, Bureau of the Census, acted as chairman of the
first Friday afternoon session at which time the following invited address was 
given: 


The ONR Program in Probability and Mathematical Statistics. Herbert Solomon, Office of 
Naval Research. 


The second session of Friday afternoon was a joint meeting with the Wash- 
ington Section of the American Society for Quality Control, with its chairman, 
Mr. Leon Gilford, Bureau of the Census, presiding. The invited address was: 


Recent Developments in Acceptance-Sampling Theory and Practice. A. H. Bowker, Stan- 
ford University. 


Dr. Joseph F. Daly, Bureau of the Census, presided at the opening session of 
Saturday morning, at which the following papers were contributed: 


1. An Approximation Theorem. Preliminary Report. I. Richard Savage, Statistical En-
gineering Laboratory, National Bureau of Standards.

2. Testing Multiparameter Hypotheses. E. L. Lehmann, Stanford University and Uni-
versity of California, Berkeley.

3. Analysis of a Certain Random Walk by the Monte Carlo Method. Robert Mirsky, Cornell
Aeronautical Laboratory, Inc., Buffalo, N. Y.

4. On Certain Estimators Based on Large Samples of Extremes. Preliminary Report. Julius
Lieblein, Statistical Engineering Laboratory, National Bureau of Standards.

5. The Use of Previous Experience in Reaching Statistical Decisions. (By title.) J. L.
Hodges, Jr., University of California, Berkeley, and E. L. Lehmann, Stanford Uni-
versity and University of California, Berkeley.

6. Results of Some Tests of Randomness on Pseudo-Random Numbers. Preliminary Report.
J. M. Cameron, Statistical Engineering Laboratory, National Bureau of Standards.


This was followed by a symposium on Statistical Inference presided over by 
Professor John W. Tukey, Princeton University. The two speakers and their 
subjects were: 


1. On the Varieties of Statistical Reasoning. George E. Kimball, Columbia University. 


2. Statistical Decision Theory in Relation to Scientific Experimentation. David Blackwell,
Howard University.


First on the Saturday afternoon program was the NBS Semicentennial Ses- 
sion with Dr. John H. Curtiss, Chief of the National Applied Mathematics 
Laboratories, acting as chairman. Dr. A. V. Astin, Acting Director of the Na- 
tional Bureau of Standards, gave the address of welcome. Subsequently, Dr. T. 
W. Anderson, on behalf of the Institute, presented to Dr. Astin a scroll to the 
National Bureau of Standards in commemoration of its 50 years of service in 
science. 

Special events of the meeting included tours and demonstrations of various 
scientific activities at the National Bureau of Standards. On Friday those in- 
terested were taken on guided tours of selected NBS Laboratories and observed 
exhibits and demonstrations of clinical thermometer testing, standards of length 
and mass, tire and leather testing, radioactivity standards, spectrochemistry,
and chemical thermometric standards. On Saturday a demonstration of the 
National Bureau of Standards Eastern Automatic Computer—the SEAC—was 
held. The generation and testing of random numbers and a solution of Laplace’s 
equation by the Monte Carlo method were demonstrated. 
Churchill Eisenhart
Assistant Secretary




REPORT OF THE BOSTON MEETING OF THE INSTITUTE 


The fourteenth Annual Meeting of the Institute of Mathematical Statistics 
was held in Boston, Massachusetts on December 26-29, 1951. Headquarters 
were at the Copley Plaza Hotel where all the sessions were held. Joint sessions 
were held with the American Statistical Association and the Econometric 
Society. The following 225 members of the Institute attended: 


Adam Abruzzi, F. S. Acton, Beatrice Aitchison, J. E. Alman, R. L. Anderson, T. W.
Anderson, F. W. Appel, H. E. Arnold, K. J. Arnold, Max Astrachan, J. C. Bain, T. A. Ban- 
croft, G. E. Bates, H. P. Beard, R. E. Bechhofer, Joseph Berkson, C. A. Bicking, C. I. 
Bliss, Isadore Blumen, C. A. Bodwell, Paul Boschan, R. C. Bose, R. A. Bradley, A. E. 
Brandt, M. F. Bresnahan, (Mrs.) Jean Bronfenbrenner, (Mrs.) Bernice Brown, T. H. 
Brown, M. A. Brumbaugh, T. A. Budne, R. W. Burgess, I. W. Burr, L. D. Calvin, B. H. 
Camp, G. A. Carlton, P. C. Clifford, W. G. Cochran, F. G. Cornell, Jerome Cornfield, 
L. M. Court, David Cowan, C. C. Craig, S. L. Crump, J. F. Daly, G. B. Dantzig, B. B. Day,
F. R. DelPriore, Lucile Derrick, Philip Desind, W. J. Dixon, J. L. Dolby, Robert Dorfman, 
H. F. Dorn, J. A. Dudman, A. J. Duncan, D. B. Duncan, C. W. Dunnett, David Durand, 
Solomon Dutka, A. M. Dutton, P. S. Dwyer, J. S. Elston, Benjamin Epstein, W. T. Federer,
Robert Ferber, J. W. Fertig, C. H. Fischer, Evelyn Fix, L. R. Frankel, D. A. S. Fraser, 
H. A. Freeman, Milton Friedman, Edward Fritz, M. A. Girshick, Abraham Golub, L. A. 
Goodman, G. E. Gourrich, E. L. Green, B. G. Greenberg, S. W. Greenhouse, T. N. E. Gre-
ville, A. A. Grometstein, Frank Gross, H. O. Gulliksen, E. J. Gumbel, L. S. Gunlogson,
John Gurland, Margaret Gurney, R. K. Haddad, K. W. Halbert, Max Halperin, J. F. 
Hannan, M. H. Hansen, H. H. Harman, G. M. Harrington, Boyd Harshbarger, J. L. Hodges, 
Jr., R. G. Hoffman, W. C. Hood, H. B. Horton, D. G. Horvitz, R. H. Hoskins, E. E. House-
man, H. S. Houthakker, W. G. Howe, C. H. Hubbell, H. M. Humes, Leonid Hurwicz, E.
R. Immel, S. L. Isaacson, J. E. Jackson, T. A. Jeeves, P. O. Johnson, H. L. Jones, A. E.
Karp, Leo Katz, H. J. Kelly, J. C. Kiefer, E. P. King, Jr., T. C. Koopmans, R. L. Kozelka, 
K. H. Kramer, William Kruskal, Jack Laderman, D. H. Leavens, G. J. Lieberman, J. E.
Lieberman, R. E. Link, S. B. Littauer, G. F. Lunger, W. G. Madow, E. S. Marks, Jacob 
Marschak, P. J. McCarthy, G. E. McCreary, F. E. McIntyre, Paul Meier, Margaret Mer- 
rell, U. A. Metzner, M. F. Millikan, Albert Mindlin, Robert Mirsky, E. B. Mode, J. E. 
Morton, L. E. Moses, Jack Moshman, Frederick Mosteller, Hugo Muench, R. B. Murphy, 
N. R. Neifeld, C. J. Nesbitt, R. T. Nichols, H. Nisselson, G. E. Noether, M. L. Norden, 
H. W. Norton, E. G. Olds, Ingram Olkin, P. S. Olmstead, Toby Oxtoby, W. R. Pabst, Jr.,
R. E. Patton, B. E. Phillips, Aditya Prakash, D. A. Probst, Frank Proschan, Howard 
Raiffa, L. J. Reed, P. R. Rider, H. M. Roberts, C. F. Roos, J. H. Roseboom, Irving Rosh- 
walb, S. N. Roy, Herman Rubin, Rose Sachs, M. M. Sandomire, M. A. Schneiderman, E. 
L. Scott, Esther Seiden, Jack Sherman, S. S. Shrikhande, Elizabeth Shuhany, I. H. Siegel,
W. R. Simmons, W. B. Simpson, Rosedith Sitgreaves, F. H. Smith, J. H. Smith, Milton Sobel,
Herbert Solomon, F. A. Sorenson, Mortimer Spiegelman, M. D. Springer, J. P. Stanley,
C. M. Stein, Joseph Steinberg, H. W. Steinhaus, F. F. Stephan, S. A. Stouffer, W. F. Taylor, 
Henry Teicher, Dan Teichroew, B. J. Tepping, J. R. Terrell, M. E. Terry, W. R. Thomp- 
son, R. M. Thrall, D. V. Tiedeman, J. W. Tukey, D. F. Votaw, Jr., J. E. Walsh, E. S. 
Weiss, Samuel Weiss, E. L. Welker, Phillips Whidden, A. G. Whitney, S. S. Wilks, Gerald
Winston, Jacob Wolfowitz, M. A. Woodbury, C. A. Wright, Jacob Yerushalmy, William 
Youden, W. B. Zacharias. 


The meeting opened on Wednesday, December 26, 1:30 P.M. with the first 
session of contributed papers, Professor S. B. Littauer, Columbia University, 
presiding. The following papers were presented: 


1. Two Rank Order Tests Which Are Most Powerful against Specific Parametric Alterna-
tives. Milton E. Terry, Jr., Virginia Polytechnic Institute.

2. Partially Balanced Designs with $k > r = 3$, $\lambda_1 = 1$, $\lambda_2 = 0$. R. C. Bose and W. H.
Clatworthy, University of North Carolina.

3. Some Observations on the F-test in Analysis of Variance. S. N. Roy, University of
North Carolina.

4. The Neyman-Pearson Lemma Factor Functions. L. M. Court, American Power Jet
Company, New York.

5. The Probability of a Correct Ranking. Preliminary Report. Robert E. Bechhofer,
Columbia University.

6. A Nonparametric Analogue Based upon Ranks of One-way Analysis of Variance.
William Kruskal, University of Chicago.

7. A Series of Group Divisible Designs for Two-way Elimination of Heterogeneity. S. S.
Shrikhande, University of Kansas.

8. A Test of the Uniformity of the Circular Distribution. Preliminary Report. J. A. Green-
wood, Manhattan Life Insurance Company, and David Durand, National Bureau
of Economic Research.

9. A Method for Limit Theorems in Markov Chains. (By title.) T. E. Harris, The Rand
Corporation.

10. On Tests of Certain Hypotheses About Multivariate Normal Populations. (By title.)
S. N. Roy, University of North Carolina.

11. An Inequality for Orthogonal Arrays of Strength 2. (By title.) S. S. Shrikhande, Uni-
versity of Kansas.


At the second session, at 3:30 on the opening day, on Monte Carlo Methods, 
the following papers were given: 


1. Some Experiments on Monte Carlo Methods. John Todd, National Bureau of Standards. 
2. Empirical Sampling Distributions. Dan Teichroew, University of North Carolina. 


Discussion was by Professor R. L. Anderson, University of North Carolina. 
Professor Paul S. Dwyer, University of Michigan, was chairman of the session.

A session on Problems of Identifiability was held jointly with the American 
Statistical Association and the Econometric Society at 9:30 A.M. on Thursday, 
December 27, with Professor Robert Solow, Massachusetts Institute of Tech- 
nology, as chairman. The following three papers were presented: 


1. Identifiability and Consistent Estimability of Structural Relations. Jerzy Neyman, 
University of California, Berkeley. 

2. Systems of Nonlinear Stochastic Difference Equations. Herman Rubin, Stanford Uni-
versity.

3. Necessary and Sufficient Conditions for Consistent Estimability. J. L. Hodges, Univer-
sity of Chicago.


Discussion was by Professor C. M. Stein, University of Chicago, Professor L. 
Hurwicz, University of Minnesota, and Professor T. A. Jeeves, University of 
California, Berkeley. 

The second session of contributed papers was held at 9:30 A.M. on Thursday, 
December 27, with Marion M. Sandomire, Atomic Energy Commission, New 
York, presiding. The following papers were presented: 


1. The Distribution of the Range in Samples from a Compound Normal Population. Ken-
neth H. Kramer, Youngstown Sheet and Tube Company.

2. On the Operating Characteristics of Certain Quality Control Tests. John E. Walsh, U.
S. Naval Ordnance Test Station, Pasadena.

3. Operating Characteristic of the Control Chart for Sample Means. Preliminary Report.
Edgar P. King, Carnegie Institute of Technology.

4. Joint Sampling Distribution of the Mean and Standard Deviation for Frequency Func-
tions of the Second Kind. Melvin D. Springer, U. S. Naval Ordnance, Indianapolis.

5. Statistical Theory of Droughts. E. J. Gumbel, Consultant, Stanford University.

6. Some Tests based on the First r Ordered Observations Drawn from an Exponential Dis-
tribution. Benjamin Epstein and Milton Sobel, Wayne University.

7. Some Theorems Relevant to Life Testing. Milton Sobel and Benjamin Epstein, Wayne
University.

8. A Method of Reducing the Time Required to Complete Certain Fatigue Tests. Leonard
G. Johnson, General Motors Corporation, Detroit.

9. On the Multivariate Poisson Distribution. Henry Teicher, Purdue University.

10. Formulas for Approximating the Hypergeometric and Binomial by the Poisson Dis-
tribution. Irving W. Burr, Purdue University.

11. Distribution of Ranges from an Arbitrary Discrete Universe. (By title.) Irving W.
Burr, Purdue University.


At 1:30 P.M., Thursday, December 27, a session on Statistical Models in
Learning Theory was held, with Professor Harold Gulliksen, Educational Test-
ing Service, Princeton, N. J., as chairman. The following papers were presented:


1. A Linear-operator Model for Learning. Robert H. Bush and Frederick Mosteller, 
Harvard University. 

2. A Statistical Description for Verbal Learning. George A. Miller and William J. McGill, 
Massachusetts Institute of Technology. 

3. Application of a Set-theoretical Model to Learning Phenomena. Cletus J. Burke and 
W. K. Estes, Indiana University. 


Discussion was by Professor Max A. Woodbury, Princeton University, Pro- 
fessor J. C. R. Licklider, Massachusetts Institute of Technology and Professor 
Leo Goodman, University of Chicago. 

An invited address on The Spectrum View of the Analysis of Time Series was 
presented by Professor John W. Tukey, Bell Telephone Laboratories, at 
3:30 P.M., Thursday, December 27. Discussion was given by Professor Guy 
H. Orcutt, Harvard University, and Professor T. W. Anderson, Columbia 
University. Professor M. A. Girshick, Stanford University, was chairman of the 
session.

A session on Statistical Inference was held at 9:30 A.M., Friday, December
28, Professor S. S. Wilks, Princeton University, serving as chairman. The fol-
lowing two papers were given:

1. The Chi-square Test of Goodness of Fit. W. G. Cochran, Johns Hopkins University. 


2. Analyzing Straight Line Data. Forman S. Acton, National Bureau of Standards,
Institute of Numerical Analysis. 


Discussion was by Dr. William F. Taylor, School of Aviation Medicine, Ran- 
dolph Field, and Dr. W. J. Youden, National Bureau of Standards. 

The session, Sampling in the 1950 Census, was held jointly with the American 
Statistical Association at 1:30 P.M., Friday, December 28, with Morris H. 
Hansen, Bureau of the Census, presiding. The following two papers were given: 


1. Sampling in the 1950 Census of Population and Housing. Joseph Steinberg and Joseph
Waksberg, with A. Mindlin, T. Jabine, N. Lieder and H. Hess, Bureau of the Census.

2. Sampling in the 1950 Census of Agriculture. Harold Nisselson and Floyd Berger,
Bureau of the Census.


Discussion was by Professor Frederick F. Stephan, Princeton University, and 
Earl N. Houseman, Bureau of Agricultural Economics. 

At 3:30 P.M., Friday, December 28, there was a Round Table on Goals for 
the Study of Mathematics by Social Scientists held jointly with the American 
Statistical Association and the Econometric Society, Professor William G. 
Madow, University of Illinois, serving as moderator. The five speakers were 
Professors George A. Miller, Massachusetts Institute of Technology, Jacob 
Marschak, University of Chicago, Paul F. Lazarsfeld, Columbia University, 
Frederick F. Stephan, Princeton University, Rutledge Vining, University of 
Virginia; and discussants were members of the Committee on the Mathematical 
Training of Social Scientists. 

At 9:30 A.M., Saturday, December 29, the Institute joined with the American 
Statistical Association, the American Society of Actuaries, the Biometric So- 
ciety, and the American Association of University Teachers of Insurance in a 
session on Discrete Random Processes and Actuarial Theory, at which H. L. Seal 
of Morss and Seal gave an invited address on Probability Theory of Decrements 
from Population. There was discussion by Professor H. W. Alexander, Adrian 
College (represented by C. J. Nesbitt), Dr. T. N. E. Greville, National Office 
of Vital Statistics, and Dr. John E. Walsh, U. S. Naval Ordnance Test Station,
Pasadena. Dr. Mortimer Spiegelman, Metropolitan Life Insurance Company, 
served as chairman. 

A session on Efficiency and Superefficiency of Estimates was held at 1:30 P.M., 
Saturday, December 29, with Dr. J. F. Daly, Bureau of the Census, presiding. 
The following four papers were presented: 


1. On the Problem of Asymptotic Efficiency of Estimates. Jerzy Neyman, University of
California, Berkeley.

2. Local Superefficiency. J. L. Hodges, University of Chicago.

3. On Sets of Parameter Points where It Is Possible to Achieve Superefficiency of Estimates.
Lucien M. Le Cam, University of California, Berkeley.

4. Relative Precision of Least Squares and Maximum Likelihood Estimates of Regression
Coefficients. Joseph Berkson, Mayo Clinic.


The third session of contributed papers was the closing session of the meeting, 
held at 3:30 P.M., Saturday, December 29. Professor Elmer B. Mode of Boston 
University served as chairman and the following papers were presented: 


1. Sufficient Statistics and Selection Depending on the Parameter. D. A. S. Fraser, Uni- 
versity of Toronto. 
2. On a Problem Suggested by Blackwell. Preliminary Report. Charles Stein, University 
of Chicago. 
3. On a Class of Infinitely Divisible Distributions. Gopinath Kallianpur, University of 
California, Berkeley, and Herbert Robbins, University of North Carolina. 
4. Almost Sure Estimability of Linear Structures in n Dimensions. T. A. Jeeves, Uni- 
versity of California, Berkeley. 
5. Completeness of the Class of Admissible Decision Procedures. Herman Rubin, Stanford 
University. 
6. Moment-Problem Solutions with Continuous Derivatives. Lionel Weiss, University of 
Virginia. 
7. Significance Consistency of the Basic Neyman-Pearson Test. (By title.) L. M. Court, 
American Power Jet Company, New York. 


Meetings of the Council were held on Thursday, December 27, and Friday, 
December 28, both at 5:40 P.M., with Professor Paul S. Dwyer and Professor M. 
A. Girshick, respectively, presiding. The Business Meeting was held on Friday, 
December 28, at 8:30 A.M., with Professor Dwyer presiding. The report of this 
meeting appears elsewhere in this issue. 

S. B. Littauer, 
Associate Secretary 

Elmer B. Mode, 
Assistant Secretary 




MINUTES OF THE ANNUAL MEMBERSHIP MEETING, 


BOSTON, DECEMBER 28, 1951 


The meeting was called to order at 8:35 A.M. in the Foyer of the Hotel 
Copley Plaza by President P. S. Dwyer. There were thirty-five members present. 

The annual reports of the President, Editor and Secretary-Treasurer were read. 
These are printed elsewhere in this issue. 

A report of the Committee on the Editorship was given by S. S. Wilks with the 
information that the Council had approved the report and elected E. L. Lehmann 
as Editor for 1953-1955. 

The report of the Committee on Fellows was given. The names of the newly 
elected fellows are included in the report of the President. 
The Program Coordinator announced that the following meetings are scheduled 
for the year 1952: 


Meeting Location Date 
Eastern Region Blacksburg, Virginia March 19-21 
Western Region Eugene, Oregon June 19-21 
Summer East Lansing, Michigan Sept. 2-6 
Annual Chicago, Illinois Dec. 27-29 


The tellers reported the election of the following: 


President-Elect M. H. Hansen 
Members of Council A. H. Bowker 
for 1952-1954 T. C. Koopmans 
H. E. Robbins 
H. G. Romig 


It was moved by F. C. Mosteller that the members of the Program Committee 
and other members of the Institute responsible for the Boston Meeting be con- 
gratulated on the success of their efforts. The motion was seconded and carried. 

The new President for 1952, M. A. Girshick, was introduced by retiring 
President P. S. Dwyer. 

The meeting adjourned at 9:20 A.M. 

C. H. Fischer 
Secretary 




REPORT OF THE PRESIDENT OF THE INSTITUTE FOR 
1951 


This has been a good year in many ways for the Institute. We have had three 
fine sectional meetings in addition to our two national meetings. The Annals 
continues to print excellent articles, new members have joined the Institute, 
and the financial condition has improved considerably. These results are due, 
of course, to the cooperation of many people including the officers, the members 
of the Council, the editors and contributors to the Annals, the associate and 
assistant secretaries, and the many members of the various committees who 
have carried out specific tasks so well. I first want to thank, on behalf of the 
Institute, every one who has assisted in making the year a success. 

I will not here list all the various members who have assisted the Institute 
during this year. A list of the various committees of the Institute was sent you 
early in 1951. A copy of this list, revised to date, is given as an appendix to this 
report. I do want to mention two committees which were authorized at the Min- 
neapolis Council meeting to do two specific tasks. 

A committee consisting of E. G. Olds, chairman; W. D. Baten, A. H. Bowker, 
P. S. Olmstead, and D. F. Votaw, Jr., was appointed to study the question of 
the selection of a secretary-treasurer, since the term of our present secretary- 
treasurer expires next June. This committee has done fine work which will lead 
soon, we expect, to the appointment of a new secretary-treasurer. A second 
committee consisting of S. S. Wilks, chairman; Harold Hotelling, W. G. Cochran, 
and W. E. Deming, was appointed to study the question of a selection of an 
editor, since the term of our present editor expires December 31, 1952. This 
committee has completed its work and the Council has approved its choice, E. 
L. Lehmann, as editor during the 1953-1955 term. 

The work of the promotional committees should be mentioned. These are new 
committees and they were not appointed sufficiently early in the year so as to 
show startling results this year. However, the president for 1952 is keeping the 
promotional committees at work and their continuing efforts should show re- 
sults next year. 

The membership committee, under the leadership of Professor W. D. Baten, 
particularly should be commended, since there were at least 30 new members 
in 1951 who were directly attributable to the work of this committee. 

As president, I want particularly to express the appreciation of the Institute 
to the secretary-treasurer and to the associate and assistant secretaries for the 
fine work they have done during the year. Also I should mention the chairmen 
of the program committees, especially Professor Leo Katz, chairman for the 
Minneapolis meeting, and Dr. Joseph Daly, chairman for the annual meeting 
and program coordinator. 


The only change in the by-laws this year involved an adjustment in the dues 
for members who are students, and for those who are retired. 

Upon nomination of the Committee on Fellows, the Council elected the fol- 
lowing as Fellows of the Institute: 


F. J. Anscombe               Evelyn Fix 
K. J. Arrow                  R. Fortet 
E. W. Barankin               T. E. Harris 
Bruno de Finetti             Howard Levene 
Aryeh Dvoretzky              Elizabeth Scott 


The Nominating Committee for 1952 will consist of: 


W. E. Deming, Chairman Elizabeth Scott 
C. A. Bennett P. R. Rider 
H. W. Norton 


Appendix. Committees of the Institute, 1951 
1. The Council and Committees of the Council 


(a) Members of the Council 
Term expires 1951            Term expires 1952            Term expires 1953 
W. G. Cochran                David Blackwell              Harald Cramér 
Churchill Eisenhart          W. G. Madow                  A. M. Mood 
Harold Hotelling             F. C. Mosteller              Jerzy Neyman 
E. L. Lehmann                L. J. Savage                 S. S. Wilks 
(b) Executive Committee 
P. S. Dwyer, President                  C. H. Fischer, Secretary-Treasurer 
M. A. Girshick, President-Elect         T. W. Anderson, Editor 
(c) Committee on Fellows 
M. A. Girshick, Chairman                E. L. Lehmann 
T. W. Anderson                          Jerzy Neyman 
Churchill Eisenhart                     H. E. Robbins 


2. Committees Related to Program 


(a) Annual Meeting—Boston 
J. F. Daly, Chairman                    Michel Loève 
R. L. Anderson                          F. C. Mosteller 
H. F. Dorn                              Joseph Steinberg 
S. B. Littauer                          E. B. Mode, Assistant Secretary 
(b) Minneapolis Meeting 
Leo Katz, Chairman                      Elizabeth Scott 
D. H. Blackwell                         Gerhard Tintner 
P. O. Johnson                           A. W. Tucker 
J. F. Kenney                            A. E. Treloar, Assistant Secretary 
(c) Eastern Region 
H. E. Robbins, Chairman                 W. N. Hurwitz 
R. A. Bradley                           D. F. Votaw, Jr. 
I. D. J. Bross                          M. A. Woodbury 
S. L. Crump 
(d) Central Region 
Oscar Kempthorne, Chairman              W. H. Kruskal 
D. A. Darling                           K. O. May 
Leonid Hurwicz                          D. R. Whitney 
(e) Western Region 
W. J. Dixon, Chairman                   J. L. Hodges 
Z. W. Birnbaum                          P. G. Hoel 
A. H. Bowker                            A. M. Mood 
(f) Plans for Coordination 
W. G. Madow, Chairman 
J. F. Daly 
J. W. Tukey 
(g) Program Coordinator 
J. F. Daly (The program coordinator is ex officio member of all program committees.) 
(h) Special Invited Papers 
Leo Katz, Chairman 
T. W. Anderson 
J. F. Daly 
W. J. Dixon 
3. Promotional Committees 


(a) Membership 
W. D. Baten, Chairman                   O. Kempthorne 
C. R. Blyth                             H. E. Robbins 
D. B. DeLury                            S. S. Wilks 
(b) Academic Institutional Members 
I. W. Burr, Chairman 
D. H. Blackwell 
(c) Non-Academic Institutional Members 
F. E. Grubbs 
C. C. Hurd 
Jack Sherman 


T. N. E. Greville 
.E 


D. E. South 


L. A. Knowler 
A. E. Treloar 
(d) Subscriptions 
Marion Sandomire, Chairman Leo Goodman 
Mary Elveback F. C. Leone 


4. Other Committees 


(a) Nominating Committee: appointed by 1950 President, J. L. Doob 
G. W. Brown, Chairman G. E. Nicholson 
K. J. Arrow H. W. Norton 
E. L. Lehmann J. W. Tukey 
Investigate Less Expensive Possibilities for Printing Annals 
T. W. Anderson, Chairman W. A. Wallis 
W. E. Deming S. S. Wilks 
Revive Statistical Research Memoirs 
Henry Scheffé, Chairman C. C. Hurd 
T. W. Anderson 
Rietz Lecture Committee 
J. Neyman, Chairman Will Feller 
C. C. Craig 
Rietz Lecturer, 1951 
Harold Hotelling 
Special Membership Rates 
P. R. Rider, Chairman C. H. Fischer 
K. J. Arnold E. G. Olds 
B. H. Camp 
Standards for Statisticians in Government Service 
W. E. Deming, Chairman Churchill Eisenhart 
H. F. Dorn B. J. Tepping 
Tabulation 
C. C. Craig, Chairman A. H. Bowker 
K. J. Arnold Evelyn Fix 
Wald Memorial 
Howard Levene, Chairman E. L. Lehmann 
T. W. Anderson W. G. Madow 
Committee on Secretary-Treasurership 
E. G. Olds, Chairman P. S. Olmstead 
W. D. Baten D. F. Votaw, Jr. 
A. H. Bowker 
Committee on Editorship 
S. S. Wilks, Chairman W. E. Deming 
W. G. Cochran Harold Hotelling 


5. Representatives of the Institute for 1951 


(a) To the American Association for the Advancement of Science 
Jerzy Neyman 

(b) To the National Research Council, Division of Physical Sciences 
Walter Bartky (to June, 1951) 
S. S. Wilks (from July, 1951) 
(c) To the Mathematical Policy Committee 
Henry Scheffé 
(d) To the Joint Committee for Development of Statistical Applications in Engineering 
and Manufacturing 
Benjamin Epstein 
(e) To the American Academy of Political and Social Science 
B. J. Tepping 
F. F. Stephan 
(f) On the Committee on the Teaching of Mathematics for Statistics and the Social Sciences 
W. G. Madow 
T. W. Anderson 
Paul S. Dwyer 


President 
December 28, 1951 




REPORT OF THE SECRETARY-TREASURER OF THE 
INSTITUTE FOR 1951 


At the beginning of 1951 the Institute had 1239 members and during the 
period covered by this report 135 new members (1 of whom was appointed to 
membership by the new Institutional Member, Raytheon Manufacturing Co., 
Newton, Massachusetts) joined the Institute and 3 were reinstated. During 
1951 the Institute lost 149 members, of which 52 were by resignation, 93 were 
cancelled for nonpayment of dues, and 4 were deceased. Judging from the in- 
formation available at this date, the Institute will have 1228 members as it 
starts 1952. 
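
The membership figures quoted above can be checked with simple bookkeeping. The following Python sketch is an illustration added for the reader, not part of the original report; the variable names are assumptions chosen for clarity, and the numbers are those stated in the paragraph.

```python
# Membership bookkeeping for 1951, using the figures quoted in the report.
# Variable names are illustrative and do not appear in the original.
members_start_1951 = 1239
new_members = 135
reinstated = 3

# Losses during 1951, broken down by cause.
resigned = 52
cancelled_nonpayment = 93
deceased = 4
total_lost = resigned + cancelled_nonpayment + deceased

members_start_1952 = members_start_1951 + new_members + reinstated - total_lost

print(total_lost)          # 149
print(members_start_1952)  # 1228
```

Both results agree with the report: 149 members lost and 1228 members at the start of 1952.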

A list of the meetings of the Institute held during 1951 is given below, together 
with the names of the assistant and associate secretaries and program committee 
chairmen directly responsible. 


                                        Assistant          Associate 
Date          Place                     Secretary          Secretary          Program Chairman 

March 15-17   Oak Ridge, Tenn.          Jack Moshman       K. J. Arnold       Oscar Kempthorne 
June 16       Santa Monica, Calif.      A. M. Mood         J. L. Hodges       W. J. Dixon 
Sept. 3-7     Minneapolis, Minn.        A. E. Treloar      K. J. Arnold       Leo Katz 
Oct. 26-27    Washington, D.C.          C. E. Eisenhart    S. B. Littauer     C. E. Eisenhart 
Dec. 26-2     Boston, Mass.             E. B. Mode         S. B. Littauer     J. F. Daly 


On behalf of the Institute, the Secretary wishes to express appreciation to all 
of these men and to the other members of the program committees for making 
each of these meetings a success. Dr. J. F. Daly did an excellent job as Program 
Coordinator and ex officio member of all program committees. 


INSTITUTE OF MATHEMATICAL STATISTICS 
Statement of Condition 
December 31, 1951 
ASSETS 

Bank                                                                    $10,484.27 
Dues Receivable                                                             134.00 
Subscriptions Receivable                                                    937.57 
U.S. Government Bonds                                                     4,888.00 

Total Assets                                                            $16,443.84 

LIABILITIES AND RESERVES 

Amount Due Printer for December Issue (Estimate)        $2,631.79 
Withholding Tax Payable                                    123.00 
Amount Due for Reprinting Back Issues                    1,005.70 
Miscellaneous Liabilities                                   76.76       $ 3,837.25 
Reserve for Dues Advanced                               $1,976.00 
Reserve for Subscriptions Advanced                       2,937.45 
Reserve for Life Members                                 2,757.50 
Reserve for Biometrika                                      84.50         7,755.45 
Total Liabilities and Reserves                                          $11,592.70 
Surplus* (Excess of Assets over Liabilities)                              4,851.14 
                                                                        $16,443.84 
* Surplus is not adjusted for inventory of back issues estimated at a nominal value of 
$19,307.39 (67¢ per issue). 
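
As a cross-check on the Statement of Condition, the subtotals and the surplus can be recomputed from the individual entries. The sketch below is an added illustration under the assumption that the line items are as printed above; the dictionary keys are shortened labels, not the report's exact wording.

```python
from decimal import Decimal as D

# Line items as printed in the Statement of Condition (December 31, 1951).
assets = {
    "bank": D("10484.27"),
    "dues_receivable": D("134.00"),
    "subscriptions_receivable": D("937.57"),
    "government_bonds": D("4888.00"),
}
liabilities = {
    "printer_december_issue": D("2631.79"),
    "withholding_tax": D("123.00"),
    "reprinting_back_issues": D("1005.70"),
    "miscellaneous": D("76.76"),
}
reserves = {
    "dues_advanced": D("1976.00"),
    "subscriptions_advanced": D("2937.45"),
    "life_members": D("2757.50"),
    "biometrika": D("84.50"),
}

total_assets = sum(assets.values())
total_liabilities = sum(liabilities.values())
total_reserves = sum(reserves.values())
surplus = total_assets - (total_liabilities + total_reserves)

print(total_assets)                        # 16443.84
print(total_liabilities, total_reserves)   # 3837.25 7755.45
print(surplus)                             # 4851.14
```

The recomputed totals match the printed ones, so the statement balances at $16,443.84.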


Revenue and Expense Statement 
For the year ending December 31, 1951 


Revenues 
Dues Revenue                                            $12,025.75 
Subscriptions Revenue                                     5,461.21 
Sale of Back Issues                                       4,628.12 
Interest Earned on Bonds                                    100.00 
Miscellaneous Revenue                                        46.54      $22,261.62 


Expenses 
Printing of Annals Current                              $10,830.43 
Reprinting of Back Issues                                 2,386.11 
Salary Expense                                            3,187.50 
Miscellaneous Printing of Stationery & Postage            1,816.82 
Contributions to American Math. Society                     196.26 
Miscellaneous Office Expense                                286.40 
Editorial Expense                                           200.00 
Meeting Expense                                             104.07 
Binding Expense                                              73.00 
Travelling Expense                                          172.68 
President’s Fund                                             18.04 


Excess of Revenues over Expenses                                        $ 2,990. 
Excess of Assets over Liabilities December 31, 1950                       1,860 


Excess of Assets over Liabilities December 31, 1951                     $ 4,851.14 
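
The Revenue and Expense Statement can be checked in the same way. In this copy the expense total is not shown and the two "excess" figures appear without their cents, so the sketch below, an added illustration rather than part of the report, recomputes them from the line items. The cents it produces are derived values, assuming the printed line items are complete and correct; they are not quoted from the source.

```python
from decimal import Decimal as D

# Revenue and expense line items as printed for the year ending Dec. 31, 1951.
revenues = [D("12025.75"), D("5461.21"), D("4628.12"), D("100.00"), D("46.54")]
expenses = [
    D("10830.43"), D("2386.11"), D("3187.50"), D("1816.82"), D("196.26"),
    D("286.40"), D("200.00"), D("104.07"), D("73.00"), D("172.68"), D("18.04"),
]

total_revenue = sum(revenues)
total_expenses = sum(expenses)            # derived; no total is printed here
excess_1951 = total_revenue - total_expenses

# The printed surplus at the end of 1951 implies the surplus at the end of 1950.
surplus_end_1951 = D("4851.14")
implied_surplus_end_1950 = surplus_end_1951 - excess_1951

print(total_revenue)             # 22261.62
print(total_expenses)            # 19271.31
print(excess_1951)               # 2990.31
print(implied_surplus_end_1950)  # 1860.83
```

The dollar parts of the two derived figures (2,990 and 1,860) agree with what is legible in the statement, which suggests the listed items are complete.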


It has been our practice to set up an amount equal to all life membership 
payments as a liability and to hold all these funds in reserve until the death of 
the member—after which his payment is released to the general fund. There 
were no new life membership payments during 1951 nor were there any deaths 
among life members. The total number of life members therefore remains as 33. 

The increases in dues and subscription prices which became effective on Jan- 
uary 1, 1951, rescued the Institute from its position of being dependent upon 
the unpredictable sale of back issues for its solvency. Had dues and subscription 
revenues remained at the 1950 level, there would have been a net operating loss 
for the year. Under present conditions, there will be some expansion possible 
in the size of the Annals. 

As was anticipated in the report of last year, the sale of back issues in 1951 
declined somewhat and the cost of reprinting those issues which were in very 
short supply increased. In 1950, the difference between the sale of back issues 
and the reprinting expense was $4,152.61, while in 1951, this difference dropped 
to $2,242.01. At the present time, our stock of back numbers is adequate, there 
being no issue of which we have fewer than 100 copies. 
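
The 1951 back-issue figure in the preceding paragraph follows from two entries in the Revenue and Expense Statement. A short added sketch, with illustrative variable names and the 1950 figure taken as quoted in the text:

```python
from decimal import Decimal

# 1951 entries from the Revenue and Expense Statement.
sale_of_back_issues_1951 = Decimal("4628.12")
reprinting_expense_1951 = Decimal("2386.11")

net_back_issues_1951 = sale_of_back_issues_1951 - reprinting_expense_1951
net_back_issues_1950 = Decimal("4152.61")   # as quoted in the report

print(net_back_issues_1951)                         # 2242.01
print(net_back_issues_1950 - net_back_issues_1951)  # 1910.60
```

The first value reproduces the $2,242.01 stated in the report; the second is the size of the year-over-year decline.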

It will be noted that at long last the Institute is in a favorable cash position. 
One should not lose sight of the fact, however, that a large portion of this im- 
provement is due to the phenomenal sale of back issues during the past two 
years, and that the increased level of dues is required to maintain our current 
solvent position. 

At the September, 1951, meeting of the Council, E. S. Pearson was appointed 
an Associate Treasurer with the primary responsibility for collecting dues from 
the members of the British Isles. 

During 1952 a new secretary will take office. This change will necessitate a 
considerable expenditure in moving the office, training a new office manager 
and, if the recommendation of the special committee on the secretary-treasurer- 
ship is adopted by the Council, the shipping of the present stock of back issues 
to the Waverly Press warehouse in Baltimore. Fortunately, our income should 
be ample to take care of all of these expenditures for 1952. 

Carl H. Fischer 
Secretary-Treasurer 
December 28, 1951 


REPORT OF THE EDITOR OF THE ANNALS FOR 1951 


The 1951 volume of the Annals contained 72 papers of which 27 were short 
notes. This number of papers is by far the largest published in any volume of 
the Annals. In addition to these research articles, the usual abstracts, reports 
of meetings, and news and notices were printed, bringing the number of pages 
of the 1951 volume to 621. 

During the past year the backlog of papers awaiting publication in the Annals 
has decreased considerably. As a result, the average time between submission 
and publication of a paper has been reduced. One of the main reasons for the 
change is that a large number of papers which would ordinarily have been sub- 
mitted to the Annals were presented at the second Berkeley Symposium and 
have been published in the Proceedings of that symposium. While the pressure 
on the Annals facilities has decreased, the causes are probably temporary and 
it is wise to be prepared for an increase again in the rate of submission of papers. 

About a year ago Abraham Wald and Mrs. Wald suddenly and tragically 
lost their lives. The loss to mathematical statistics is felt particularly in the 
pages of the Annals to which Wald was the most prolific contributor. The 
memory of Wald is honored in the 1952 volume, especially in the March issue, 
which contains the three papers presented at the Wald Memorial Session at 
the Minneapolis meetings in September, 1951. 

On behalf of the Editorial Committee, the Editor takes this opportunity to 
acknowledge the generous refereeing assistance of the following: R. L. Anderson, 
K. J. Arrow, Edward Barankin, Robert Bechhofer, Albert Bowker, D. G. 
Chapman, Herman Chernoff, D. A. Darling, W. J. Dixon, H. F. Dodge, Churchill 
Eisenhart, Evelyn Fix, D. A. S. Fraser, Leo Goodman, John Gurland, Oscar 
Kempthorne, Jack Kiefer, Roy Leipnik, F. J. Massey, Jr., M. R. Mickey, George 
Nicholson, Gottfried Noether, Edward Paulson, R. P. Peterson, Melvin Peisakoff, 
H. G. Romig, Herman Rubin, Herbert Ryser, Elizabeth Scott, L. J. Snell, D. F. 
Votaw, Jr., J. E. Walsh, Ransom Whitney. 

The Editor is indebted to Jacob Horowitz, Roy Kuebler and Jack Laderman 
for preparation of manuscripts for the printer and to Mrs. R. Murphy and 
Misses Gerda Ubel and Maria Vecchione for other editorial and office assistance. 

T. W. ANDERSON 
Editor 
December 28, 1951 
PUBLICATIONS RECEIVED 


Anuario de Comercio Exterior, Sección de Publicaciones de la Contraloría General de la 
República, Bogotá, 1951, xlviii + 740 pp. 

ROBERT DORFMAN, Application of Linear Programming to the Theory of the Firm, Univer- 
sity of California Press, Berkeley and Los Angeles, 1951, ix + 90 pp., $3.50. 

Recenseamento Geral do Brasil (1° de Setembro de 1940), Série Regional, Parte XIII—Minas 
Gerais, Tomo 1, Censo Demográfico, Serviço Gráfico, Instituto Brasileiro de Geografia 
e Estatística, Rio de Janeiro, 1950, xxxi + 243 pp. 

Recenseamento Geral do Brasil (1° de Setembro de 1940), Série Regional, Parte XVII—São 
Paulo, Tomo 1, Censo Demográfico, Serviço Gráfico, Instituto Brasileiro de Geografia e 
Estatística, Rio de Janeiro, 1950, xxx + 243 pp. 

Recenseamento Geral do Brasil (1° de Setembro de 1940), Série Nacional, Volume II, Censo 
Demográfico, Serviço Gráfico, Instituto Brasileiro de Geografia e Estatística, Rio de 
Janeiro, 1950, xxxii + 181 pp. 

Monte Carlo Method, Applied Mathematics Series 12, (National Bureau of Standards) U. S. 
Government Printing Office, Washington, D. C., 1951, vii + 42 pp., $0.30. 

Problems for the Numerical Analysis of the Future, Applied Mathematics Series 15, (National 
Bureau of Standards) U. S. Government Printing Office, Washington, D. C., 1951, 
iv + 21 pp., $0.20. 

Tables of n! and Γ(n + ½) for the First Thousand Values of n, Applied Mathematics Series 16, 
(National Bureau of Standards) U. S. Government Printing Office, Washington, D. C., 
1951, iii + 10 pp., $0.15. 

Tables to Facilitate Sequential t-Tests, Applied Mathematics Series 7, (National Bureau of 
Standards) U. S. Government Printing Office, Washington, D. C., 1951, xix + 82 pp., 
$0.45. 

SIXTO RIOS, Introducción a los Métodos Estadísticos (1ª parte), 1. Diez, Madrid, 1951, 205 
pp. 

DONALD STATLER VILLARS, Statistical Design and Analysis of Experiments for Development 
Research, Wm. C. Brown Co., Dubuque, Iowa, 1951, ix + 455 pp., $6.50. 





ESTADISTICA 


Official Journal of the Inter American Statistical Institute 


Vol. IX, No. 32 September 1951 
Contents 


Estadísticas de las Finanzas Públicas: Las Funciones del Presupuesto de los Gobi- 
ernos Centrales 


Notas sôbre o Levantamento de Dados Bio-Estatísticos na Amazônia Brasileira 
........ ACHILLES SCORZELLI JR. 


Renta Nacional ........ LORETO DOMINGUEZ 


El Problema de la Suficiencia de Pagos y de la Seguridad de Empleo para los Esta- 
dísticos en las Agencias Oficiales de la América Latina 


Quelques Observations sur L’Assimilation Linguistique des Immigrés au Brésil et de 
Leurs Descendants ........ GIORGIO MORTARA 


Informes sobre la I Sesión de la Comisión de Mejoramiento de las Estadísticas 
Nacionales, y sobre la IV Sesión de la Comisión del Censo de las Américas de 
1950, Washington, D. C., June 2-15, 1951. 


Editorial Notes. Institute Affairs. Statistical News. Publications. 


Editor: Francisco de Abrisqueta 
Inter American Statistical Institute, c/o Pan American Union, Washington 6, D. C., U. S. A. 


JOURNAL OF THE March 1952 
AMERICAN STATISTICAL ASSOCIATION Vol. 47 No. 257 


1108 16th Street, N. W., Washington 6, D. C. 


Estimation for Sub-Sampling Designs Employing the County as a Primary Sampling 
Unit ........ EMIL H. JEBE 
Some Applications of Statistics for Auditing ........ JOHN NETER 
Latent Structure Analysis and its Relation to Factor Analysis ........ BERT F. GREEN, JR. 
Fertility Trends and Differentials in the U.S. ........ CLYDE V. KISER 
REPRINTS OF ABSTRACTS IN STATISTICAL METHODOLOGY 


BOOK REVIEWS 


The American Statistical Association invites as members all per- 
sons interested in: 
1. development of new theory and method 
2. improvement of basic statistical data 
3. application of statistical methods to practical problems. 
BIOMETRIKA 
A Journal for the Statistical Study of Biological Problems 


Volume 38 Contents Parts 3 and 4, December 1951 


1. Biometrika 1901-1951. By W. P. ELDERTON. 2. Jacobians of certain matrix transformations useful in 
Multivariate analysis. By W. L. DEEMER and I. OLKIN. 3. A chart for the incomplete Beta function 
and the cumulative binomial distribution. By H. O. HARTLEY and E. R. FITCH. 4. The effect of 
standardization on a χ² approximation in factor analysis. By M. S. BARTLETT. 5. Some systematic 
experimental designs. By D. R. COX. 6. On estimating the size of mobile populations from recapture 
data. By N. T. J. BAILEY. 7. The comparison of several groups of observations when the ratios of the 
population variances are unknown. By G. S. JAMES. 8. On the comparison of several mean values: an 
alternative approach. By B. L. WELCH. 9. Tables of symmetric functions: Pts. II and III. By F. N. 
DAVID and M. G. KENDALL. 10. A mathematical theory of animal trapping. By P. A. P. MORAN. 
11. Two applications of bivariate k-statistics. By B. M. COOK. 12. Expected frequencies in a sample of 
an animal population in which the abundances of species are lognormally distributed: Pt. I, Theory; Pt. II, 
Application. By P. M. GRUNDY. 13. The fitting of polynomials to equidistant data with missing values. 
By H. O. HARTLEY. 14. The delay to pedestrians crossing a road. By J. C. TANNER. 15. Interrela- 
tions between certain linear systematic statistics of samples from any continuous population. By G. P. 
SILLITTO. 16. Truncated log-normal distributions: I. Solution by moments. By H. R. THOMPSON. 
17. Further applications of range to the analysis of variance. By H. A. DAVID. 18. The estimation of 
population parameters from data obtained by means of the capture-recapture method: I. The maximum 
likelihood equation for estimating the death rate. By P. H. LESLIE and DENNIS CHITTY. 19. MIS- 
CELLANEA. 20. REVIEWS. 


The subscription price, payable in advance, is 45s. inland, 54s. export (per volume including postage). Cheques 
should be made payable to Biometrika and sent to “The Secretary, Biometrika Office, Department of Statistics, 
University College, London, W.C. 1.” All foreign cheques must be in sterling and drawn on a bank 
having a London agency. 


ECONOMETRICA 


Journal of the Econometric Society 
Contents of Vol. 20, January, 1952, include: 


HOLLIS B. CHENERY ........ Overcapacity and the Acceleration Principle 
ROBERT SOLOW ........ On the Structure of Linear Models 
JANET A. FISHER ........ Postwar Changes in Income and Savings Among Consumers in Different Age Groups 
GEORGE H. BORTS ........ Production Relations in the Railway Industry 
Report of the Santa Monica Meeting, August 2-4, 1951. Report of the Minneapolis 
Meeting, September 4-7, 1951. Report of the Council for 1951, Treasurer’s Report, 
Election of Fellows, 1951. Book Reviews. 


Published Quarterly Subscription rates available on request 


The Econometric Society is an international society for the advancement of economic theory in its 
relation to statistics and mathematics 


Subscriptions to Econometrica and inquiries about the work of the Society and the procedure in applying 
for membership should be addressed to William B. Simpson, Secretary, The Econometric Society, The 
University of Chicago, Chicago 37, Illinois, U. S. A. 





MATHEMATICAL REVIEWS 


A journal containing reviews of the mathematical liter- 
ature of the world, with full subject and author indices 


Publication of this journal is sponsored by the American Mathe- 
matical Society, Mathematical Association of America, Institute of 
Mathematical Statistics, London Mathematical Society, Edinburgh 
Mathematical Society, Union Matematica Argentina, and others. 


Subscriptions accepted to cover the calendar year only. 
Issues appear monthly except July. $20.00 per year. 
Send subscription order or request for sample copy to 


AMERICAN MATHEMATICAL SOCIETY 
80 Waterman Street, Providence 6, Rhode Island 











JOURNAL OF THE 
ROYAL STATISTICAL SOCIETY 


Series B (Methodological) 
Contents of Volume 13, No. 1, 1951 


G. E. P. BOX AND K. B. WILSON 
On the Experimental Attainment of Optimum Conditions. (With Discussion) 
G. A. BARNARD ........ The Theory of Information. (With Discussion) 
F. BENSON AND D. R. COX 
The Productivity of Machines Requiring Attention at Random Intervals 
L. FOX AND J. G. HAYES ........ More Practical Methods for the Inversion of Matrices 
S. RUSHTON 
On Least Square Fitting by Orthonormal Polynomials Using the Choleski Method 
BARNET WOOLF ........ Computation and Interpretation of Multiple Regressions 
M. P. SCHUTZENBERGER 
An Extension Problem in the Theory of Incomplete Block Designs 
H. R. THOMSON AND I. D. DICK 
Factorial Designs in Small Blocks Derived from Orthogonal Latin Squares 
ALLADI RAMAKRISHNAN ........ Some Simple Stochastic Processes 
P. A. P. MORAN ........ Estimation Methods for Evolutive Processes 
P. A. P. MORAN ........ The Random Division of an Interval—Part II 


The Royal Statistical Society, 4, Portugal Street, London, W.C.2. 








SKANDINAVISK 
AKTUARIETIDSKRIFT 


1951 - Parts 1-2 
Contents 


HARALD BERGSTRÖM ........ On Asymptotic Expansions of Probability Functions 
EDWARD W. BARANKIN 
Concerning Some Inequalities in the Theory of Statistical Estimation 
MARTIN SANDELIUS ........ Truncated Inverse Binomial Sampling 
KNUT MEDIN ........ A Function for Smoothing Tables of the Duration of Sickness 
MARTIN WEIBULL 
The Regression Problem Involving Non-random Variates in the Case of 
Stratified Sample from Normal Parent Populations with Varying Regression 
Coefficients 
K.-G. HAGSTROEM ........ Erik Stridsberg † 


Annual subscription: 10 Swedish Crowns (Approx. $2.00). 
Inquiries and orders may be addressed to the Editor, 


SKÄRVIKSVÄGEN 7, DJURSHOLM (SWEDEN) 


SANKHYĀ 
The Indian Journal of Statistics 
Edited by P. C. Mahalanobis 


Vol. 11, Part 1, 1951 


In Memoriam: Abraham Wald 
On the Realization of Stochastic Processes by Probability Distributions in 
Function Spaces ........ HENRY B. MANN 
A Theorem in Least Squares ........ C. R. RAO 
On Type B₁ and Type B Regions ........ H. K. NANDI 
Some Notes on Ordered Samples from a Normal Population ........ K. C. S. PILLAI 
Some Exponential Forms for Topographic Correlation ........ BIRENDRANATH GHOSH 
On the Orthogonal Polynomials Associated with Student’s Distribution ........ A. S. KRISHNAMOORTHY 
A Multivariate Gamma-Type Distribution ........ V. K. RAMABHADRAN 
A Study on Differences in Physical Development by Socio-Economic Strata ........ RAMKRISHNA MUKHERJEE 
U.N. Commission on Statistical Sampling—Report. 


Annual subscription: 30 rupees 
Inquiries and orders may be addressed to the 
Editor, Sankhyā, Presidency College, Calcutta, India. 











